Deploying Forgejo Runners On RKE2
Containers, on Containers, on Containers (sorta)
As usual I have an aversion to creating “pet” VMs, so I decided to deploy my Forgejo runners on RKE2, sticking to the Rancher philosophy of “cattle not pets” (a philosophy I strongly agree with). Of course, deploying Forgejo runners in Kubernetes comes with its own issues; mainly, unlike GitLab, Forgejo cannot natively create pods in a cluster. The recommended way to deploy runners in Kubernetes is via DinD (Docker in Docker), which means my workflows will run in containers, inside a DinD container (on RKE2), inside a container (on Harvester, also RKE2). It really is containers all the way down!
Deploying the Runners
Thankfully the docs for deploying to Kubernetes give you an easy jumping-off point; however, they went with a Deployment1 instead of a StatefulSet. I seem to recall GitLab also doing this, at least at some point in the past. There are a couple of issues with this basic Deployment, the biggest being that as pods die and come back they register as whole new runners. As with GitLab in the past, this leads to innumerable dead runners that simply no longer exist.
Codeberg Deployment Manifest
---
apiVersion: v1
stringData:
  token: your_registration_token
kind: Secret
metadata:
  name: runner-secret
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: forgejo-runner
  name: forgejo-runner
spec:
  replicas: 2
  selector:
    matchLabels:
      app: forgejo-runner
  strategy: {}
  template:
    metadata:
      labels:
        app: forgejo-runner
    spec:
      restartPolicy: Always
      volumes:
        - name: docker-certs
          emptyDir: {}
        - name: runner-data
          emptyDir: {}
      initContainers:
        - name: runner-config-generation
          image: code.forgejo.org/forgejo/runner
          command:
            [
              'sh',
              '-c',
              'forgejo-runner create-runner-file --instance $FORGEJO_INSTANCE_URL --secret $RUNNER_SECRET --connect',
            ]
          env:
            - name: RUNNER_SECRET
              valueFrom:
                secretKeyRef:
                  name: runner-secret
                  key: token
            - name: FORGEJO_INSTANCE_URL
              value: https://myforgejo.org
          volumeMounts:
            - name: runner-data
              mountPath: /data
      containers:
        - name: runner
          image: code.forgejo.org/forgejo/runner
          command:
            [
              'sh',
              '-c',
              "while ! nc -z localhost 2376 </dev/null; do echo 'waiting for docker daemon...'; sleep 5; done; forgejo-runner daemon",
            ]
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_CERT_PATH
              value: /certs/client
            - name: DOCKER_TLS_VERIFY
              value: '1'
          volumeMounts:
            - name: docker-certs
              mountPath: /certs
            - name: runner-data
              mountPath: /data
        - name: daemon
          image: docker:dind
          env:
            - name: DOCKER_TLS_CERTDIR
              value: /certs
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-certs
              mountPath: /certs
Converting to a StatefulSet
Set podAntiAffinity
To start, I converted the Deployment to a StatefulSet and added a quick tweak to spread the pods across the cluster. I don’t want more than one runner on a single node, so I added the following affinity:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: "kubernetes.io/hostname"
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: forgejo-runner
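One caveat: with a hard requirement, scaling the StatefulSet beyond the node count leaves the extra pods Pending. If that ever becomes a problem, a softer variant (a sketch using preferredDuringSchedulingIgnoredDuringExecution) trades strict one-per-node spreading for schedulability:

```yaml
affinity:
  podAntiAffinity:
    # "preferred" spreads pods across nodes when possible but still
    # schedules them when every node already hosts a runner
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: "kubernetes.io/hostname"
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: forgejo-runner
```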
Configure Runner for Internal CA
I should note that I run Forgejo on an internal cluster, and that cluster also uses an internal CA, so the StatefulSet needed further edits to drop the VM’s CA trust into the containers. I am running my clusters on Rocky 9, so /etc/ssl/certs/ca-bundle.crt on the host needs to be passed into all of the StatefulSet’s containers (read-only) at /etc/ssl/certs/ca-certificates.crt. The runner config (config.yml) also needed to be edited to add a few items2 to the workflow containers: an extra environment variable, plus two options for mounting (and allowing the mounting of) the SSL certs. The following was added to my config:
runner:
  # Extra environment variables to run jobs.
  envs:
    NODE_EXTRA_CA_CERTS: "/etc/ssl/certs/ca-certificates.crt"
container:
  # Whether to use privileged mode or not when launching task containers (privileged mode is required for Docker-in-Docker).
  privileged: true
  # And other options to be used when the container is started (eg, --volume /etc/ssl/certs:/etc/ssl/certs:ro).
  options: "--volume /etc/ssl/certs/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt:ro"
  # Volumes (including bind mounts) can be mounted to containers. Glob syntax is supported, see https://github.com/gobwas/glob
  # You can specify multiple volumes. If the sequence is empty, no volumes can be mounted.
  # For example, if you only allow containers to mount the `data` volume and all the json files in `/src`, you should change the config to:
  # valid_volumes:
  #   - data
  #   - /etc/ssl/certs
  # If you want to allow any volume, please use the following configuration:
  # valid_volumes:
  #   - '**'
  valid_volumes:
    - /etc/ssl/certs/ca-certificates.crt
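To verify the cert plumbing end to end, a throwaway workflow along these lines should only pass once the mount and NODE_EXTRA_CA_CERTS both land in the job container (the repo path, runs-on label, and instance URL here are placeholders for my setup):

```yaml
# .forgejo/workflows/ca-check.yml (hypothetical)
on: [push]
jobs:
  ca-check:
    runs-on: docker
    steps:
      # the bind mount from container.options should be visible
      - run: ls -l /etc/ssl/certs/ca-certificates.crt
      # this fails with a TLS error if the internal CA is not trusted
      - run: wget -q -O /dev/null https://myforgejo.org
```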
After doing so my runner appeared in Forgejo, but there were still some issues that needed to be resolved:
- It’s cool to have a StatefulSet, but as I scale up and down, the generated config will need to be manually edited for every new PVC created
- When the StatefulSet restarts, a new runner is registered in the API despite the PVC contents persisting
Define Default Workflow Image
When defining the config.yml, it is also a good time to set the default workflow image. The label structure here is a little odd, so the docs are important.
runner:
  labels:
    - docker:docker://myharbor.org/docker/library/node:24.9.0-alpine
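With that label registered, workflows select it via runs-on; a minimal sketch:

```yaml
on: [push]
jobs:
  build:
    # "docker" matches the label above, so this job runs inside the
    # node:24.9.0-alpine image pulled from the internal Harbor mirror
    runs-on: docker
    steps:
      - run: node --version
```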
A Cornucopia of Dead Runners
The issue with a new runner registering after every pod restart turned out to be a problem with the initContainer: it always registered a new runner, even when the same PVC with a .runner file was already present. The command at the start of the initContainer does not bother checking whether .runner exists, so I needed to work around this. The solution was to write a simple bash script for the initContainer. There are two files that need to be accounted for, .runner and config.yml, and I ended up with the following code:
#!/usr/bin/env bash
if [ ! -f "/data/.runner" ]; then
  echo ".runner does not exist, registering runner now"
  echo "Registering with: $FORGEJO_INSTANCE_URL as $POD_NAME"
  forgejo-runner register --no-interactive \
    --name "$POD_NAME" \
    --instance "$FORGEJO_INSTANCE_URL" \
    --token "$RUNNER_SECRET"
else
  echo ".runner exists, no need to register"
fi

if [ ! -f "/data/config.yml" ]; then
  echo "config.yml does not exist, creating a config now"
  forgejo-runner generate-config > /data/config.yml
else
  echo "config.yml exists, no need to build a config"
fi
This script ensures the two files will be present, so as I scale up or down my StatefulSet I won’t end up with new runners in the Forgejo web UI’s “runners” section. I also added a name flag, so when the runner registers itself it will use the name of the pod as the runner name. To get the pod name I added the following snippet to the initContainer:
env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
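The registration guard in the script can be exercised locally; here is a minimal sketch using a temp directory as a stand-in for the /data PVC (the register function is a stub, not the real forgejo-runner call):

```shell
#!/usr/bin/env sh
# Stand-in for the /data PVC; the real script checks /data/.runner.
data_dir="$(mktemp -d)"

# Stub for `forgejo-runner register`; the real call talks to the Forgejo API.
register() {
  echo "registered" >> "$data_dir/.runner"
}

# Mirrors the script's guard: only register if .runner is absent.
ensure_registered() {
  if [ ! -f "$data_dir/.runner" ]; then
    register
  fi
}

ensure_registered  # first pod start: file absent, so it registers
ensure_registered  # pod restart: .runner persists on the PVC, no re-registration
echo "registrations: $(wc -l < "$data_dir/.runner")"
```

Because the guard keys off the PVC contents, scaling the StatefulSet down and back up reuses the same runner identity instead of minting a new one.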
Finally, I made sure to edit the “runner” container to use the config file explicitly; by default the upstream Deployment manifest does not have the runner daemon reference a config at all. So I edited the command to add the -c flag: -c /data/config.yml. At this point I have my runners deployed as a StatefulSet, with names that persist across the pod lifecycle, that trust the internal CA, and that automatically inject the CA into workflow pods.
After working through this, I decided it would be best to have a single config.yml dropped onto the pods as a volumeMount. If I leave it up to the script to generate the config, every new runner will need to be edited manually (at least the first time). I am far too lazy to do that, so I mounted the config.yml as a volume from a ConfigMap. Now as I scale the StatefulSet up and down, the config.yml will have my custom SSL cert config added automatically.
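For completeness, the runner-configs ConfigMap the StatefulSet references would look roughly like this (key names must match the subPath mounts; file bodies elided):

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: runner-configs
data:
  config.yml: |
    # ... the full runner config.yml ...
  setup.sh: |
    # ... the registration setup script ...
```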
The deployment is complete, but I did notice that workflow actions and images still pull from upstream Forgejo. I wanted to reduce my dependence on upstream as much as I can, so I also made sure to begin mirroring what I can locally. If you don’t care about the rest, the full code snippets are in the “Complete Code Snippets” section below.
Mirroring Actions
One of the features I love about Forgejo is repo mirroring. This was a big gripe I had with GitLab CE (I get that they need to make money somehow), and because Forgejo has mirroring I can mirror the upstream action organization’s repos. I mirrored the repos I foresee myself actually using, and I created my own organization named “actions” so that everything was mirrored, right down to the organization’s name. There are two ways to go from here:
- Simply call the workflow action using a full URL, like: https://myforgejo.org/actions/checkout@v4
- Edit Forgejo to use your own instance by default when using a short name, allowing you to keep using short names, like: actions/checkout@v4
Option one requires no further tweaking, but I wanted to take this further and set up my instance to search itself by default, not upstream. To do this, the forgejo-config ConfigMap for your Forgejo deployment needs to be edited; you can simply add the following:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: forgejo-config
data:
  actions: |
    ENABLED = true
    DEFAULT_ACTIONS_URL = https://myforgejo.org
After you add that (and restart your deployment), your instance will search itself for actions referenced by short names.
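As a quick sanity check after the restart, a workflow can reference a mirrored action either way (instance URL and runs-on label are placeholders for my setup):

```yaml
jobs:
  check:
    runs-on: docker
    steps:
      # option one: fully qualified URL, works without any server config
      - uses: https://myforgejo.org/actions/checkout@v4
      # option two: short name, resolved against DEFAULT_ACTIONS_URL
      - uses: actions/checkout@v4
```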
Complete Code Snippets
StatefulSet Manifest
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: forgejo-runner
  labels:
    app.kubernetes.io/name: forgejo-runner
spec:
  serviceName: forgejo-runner
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: forgejo-runner
  template:
    metadata:
      labels:
        app.kubernetes.io/name: forgejo-runner
    spec:
      initContainers:
        - name: runner-config-generation
          image: code.forgejo.org/forgejo/runner
          command: ['bash', '/data/setup.sh']
          env:
            - name: FORGEJO_INSTANCE_URL
              value: https://nope.lan
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: RUNNER_SECRET
              valueFrom:
                secretKeyRef:
                  name: runner-secret
                  key: token
          volumeMounts:
            - name: runner-data
              mountPath: /data
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
            - name: runner-configs
              subPath: setup.sh
              mountPath: /data/setup.sh
            - name: runner-configs
              subPath: config.yml
              mountPath: /data/config.yml
      containers:
        - name: runner
          image: code.forgejo.org/forgejo/runner
          command:
            [
              'sh',
              '-c',
              "while ! nc -z localhost 2376 </dev/null; do echo 'waiting for docker daemon...'; sleep 5; done; forgejo-runner daemon -c /data/config.yml",
            ]
          env:
            - name: DOCKER_HOST
              value: tcp://localhost:2376
            - name: DOCKER_CERT_PATH
              value: /certs/client
            - name: DOCKER_TLS_VERIFY
              value: '1'
          volumeMounts:
            - name: docker-certs
              mountPath: /certs
            - name: runner-data
              mountPath: /data
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
            - name: runner-configs
              subPath: config.yml
              mountPath: /data/config.yml
        - name: daemon
          image: docker:dind
          env:
            - name: DOCKER_TLS_CERTDIR
              value: /certs
          securityContext:
            privileged: true
          volumeMounts:
            - name: docker-certs
              mountPath: /certs
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
      securityContext:
        fsGroup: 1000
      volumes:
        - name: docker-certs
          emptyDir: {}
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs/ca-bundle.crt
            type: File
        - name: runner-configs
          configMap:
            name: runner-configs
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchLabels:
                  app.kubernetes.io/name: forgejo-runner
  volumeClaimTemplates:
    - apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: runner-data
      spec:
        accessModes:
          - "ReadWriteOnce"
        storageClassName: ceph-fs
        resources:
          requests:
            storage: "50Gi"
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
Runner config.yml
# Example configuration file, it's safe to copy this as the default config file without any modification.
# You don't have to copy this file to your instance,
# just run `forgejo-runner generate-config > config.yaml` to generate a config file.
#
# The value of level or job_level can be trace, debug, info, warn, error or fatal
#
log:
  #
  # What is displayed in the output of the runner process but not sent
  # to the Forgejo instance.
  #
  level: info
  #
  # What is sent to the Forgejo instance and therefore
  # visible in the web UI for a given job.
  #
  job_level: info

runner:
  # Where to store the registration result.
  file: .runner
  # Execute how many tasks concurrently at the same time.
  capacity: 1
  # Extra environment variables to run jobs.
  envs:
    NODE_EXTRA_CA_CERTS: "/etc/ssl/certs/ca-certificates.crt"
    # A_TEST_ENV_NAME_1: a_test_env_value_1
    # A_TEST_ENV_NAME_2: a_test_env_value_2
  # Extra environment variables to run jobs from a file.
  # It will be ignored if it's empty or the file doesn't exist.
  env_file: .env
  # The timeout for a job to be finished.
  # Please note that the Forgejo instance also has a timeout (3h by default) for the job.
  # So the job could be stopped by the Forgejo instance if its timeout is shorter than this.
  timeout: 3h
  # The timeout for the runner to wait for running jobs to finish when
  # shutting down because a TERM or INT signal has been received. Any
  # running jobs that haven't finished after this timeout will be
  # cancelled.
  # If unset or zero the jobs will be cancelled immediately.
  shutdown_timeout: 3h
  # Whether to skip verifying the TLS certificate of the instance.
  insecure: false
  # The timeout for fetching the job from the Forgejo instance.
  fetch_timeout: 5s
  # The interval for fetching the job from the Forgejo instance.
  fetch_interval: 2s
  # The interval for reporting the job status and logs to the Forgejo instance.
  report_interval: 1s
  # The labels of a runner are used to determine which jobs the runner can run, and how to run them.
  # Like: ["macos-arm64:host", "ubuntu-latest:docker://node:20-bookworm", "ubuntu-22.04:docker://node:20-bookworm"]
  # If it's empty when registering, it will ask for inputting labels.
  # If it's empty when executing the `daemon`, it will use labels in the `.runner` file.
  labels: []

cache:
  #
  # When enabled, workflows will be given the ACTIONS_CACHE_URL environment variable
  # used by the https://code.forgejo.org/actions/cache action. The server at this
  # URL must implement a compliant REST API and it must also be reachable from
  # the container or host running the workflows.
  #
  # See also https://forgejo.org/docs/next/user/actions/advanced-features/#cache
  #
  # When it is not enabled, none of the following options apply.
  #
  # It works as follows:
  #
  # - the workflow is given a one time use ACTIONS_CACHE_URL
  # - a cache proxy listens to ACTIONS_CACHE_URL
  # - the cache proxy securely communicates with the cache server using
  #   a shared secret
  #
  enabled: true
  #
  #######################################################################
  #
  # Only used for the internal cache server.
  #
  # If external_server is not set, the Forgejo runner will spawn a
  # cache server that will be used by the cache proxy.
  #
  #######################################################################
  #
  # The port bound by the internal cache server.
  # 0 means to use a random available port.
  #
  port: 0
  #
  # The directory to store the cache data.
  #
  # If empty, the cache data will be stored in $HOME/.cache/actcache.
  #
  dir: ""
  #
  #######################################################################
  #
  # Only used for the external cache server.
  #
  # If external_server is set, the internal cache server is not
  # spawned.
  #
  #######################################################################
  #
  # The URL of the cache server. The URL should generally end with
  # "/". The cache proxy will forward requests to the external
  # server. The requests are authenticated with the "secret" that is
  # shared with the external server.
  #
  external_server: ""
  #
  #######################################################################
  #
  # Common to the internal and external cache server
  #
  #######################################################################
  #
  # The shared cache secret used to secure the communications between
  # the cache proxy and the cache server.
  #
  # If empty, it will be generated to a new secret automatically when
  # the server starts and it will stay the same until it restarts.
  #
  # Every time the secret is modified, all cache entries that were
  # created with it are invalidated. In order to ensure that the cache
  # content is reused when the runner restarts, this secret must be
  # set, for instance with the output of openssl rand -hex 40.
  #
  secret: ""
  #
  # The IP or hostname (195.84.20.30 or example.com) to use when constructing
  # ACTIONS_CACHE_URL which is the URL of the cache proxy.
  #
  # If empty it will be detected automatically.
  #
  # If the containers or host running the workflows reside on a
  # different network than the Forgejo runner (for instance when the
  # docker server used to create containers is not running on the same
  # host as the Forgejo runner), it may be impossible to figure that
  # out automatically. In that case you can specify which IP or
  # hostname to use to reach the internal cache server created by the
  # Forgejo runner.
  #
  host: ""
  #
  # The port bound by the internal cache proxy.
  # 0 means to use a random available port.
  #
  proxy_port: 0
  #
  # Overrides the ACTIONS_CACHE_URL variable passed to workflow
  # containers. The URL should generally not end with "/". This should only
  # be used if the runner host is not reachable from the workflow containers,
  # and requires further setup.
  #
  actions_cache_url_override: ""

container:
  # Specifies the network to which the container will connect.
  # Could be host, bridge or the name of a custom network.
  # If it's empty, create a network automatically.
  network: ""
  # Whether to create networks with IPv6 enabled. Requires the Docker daemon to be set up accordingly.
  # Only takes effect if "network" is set to "".
  enable_ipv6: false
  # Whether to use privileged mode or not when launching task containers (privileged mode is required for Docker-in-Docker).
  privileged: true
  # And other options to be used when the container is started (eg, --volume /etc/ssl/certs:/etc/ssl/certs:ro).
  options: "--volume /etc/ssl/certs/ca-certificates.crt:/etc/ssl/certs/ca-certificates.crt:ro"
  # The parent directory of a job's working directory.
  # If it's empty, /workspace will be used.
  workdir_parent:
  # Volumes (including bind mounts) can be mounted to containers. Glob syntax is supported, see https://github.com/gobwas/glob
  # You can specify multiple volumes. If the sequence is empty, no volumes can be mounted.
  # For example, if you only allow containers to mount the `data` volume and all the json files in `/src`, you should change the config to:
  # valid_volumes:
  #   - data
  #   - /etc/ssl/certs
  # If you want to allow any volume, please use the following configuration:
  # valid_volumes:
  #   - '**'
  valid_volumes:
    - /etc/ssl/certs/ca-certificates.crt
  # Overrides the docker client host with the specified one.
  # If "-" or "", an available docker host will automatically be found.
  # If "automount", an available docker host will automatically be found and mounted in the job container (e.g. /var/run/docker.sock).
  # Otherwise the specified docker host will be used and an error will be returned if it doesn't work.
  docker_host: "-"
  # Pull docker image(s) even if already present
  force_pull: false
  # Rebuild local docker image(s) even if already present
  force_rebuild: false

host:
  # The parent directory of a job's working directory.
  # If it's empty, $HOME/.cache/act/ will be used.
  workdir_parent:
Custom Setup Script
#!/usr/bin/env bash
if [ ! -f "/data/.runner" ]; then
  echo ".runner does not exist, registering runner now"
  echo "Registering with: $FORGEJO_INSTANCE_URL as $POD_NAME"
  forgejo-runner register --no-interactive \
    --name "$POD_NAME" \
    --instance "$FORGEJO_INSTANCE_URL" \
    --token "$RUNNER_SECRET"
else
  echo ".runner exists, no need to register"
fi

if [ ! -f "/data/config.yml" ]; then
  echo "config.yml does not exist, creating a config now"
  forgejo-runner generate-config > /data/config.yml
else
  echo "config.yml exists, no need to build a config"
fi
Sources
- Running on Kubernetes
- Forgejo Action Runners with Private HTTPS Certificates
- Getting Forgejo Helm Deployment to Also Trust a Local Certificate Authority