<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Kubernetes on Zwindler's Reflection</title><link>https://blog.zwindler.fr/en/tags/kubernetes/</link><description>Recent content in Kubernetes on Zwindler's Reflection</description><generator>Hugo -- gohugo.io</generator><language>en</language><copyright>Licensed under CC BY-SA 4.0</copyright><lastBuildDate>Sat, 30 May 2026 12:00:00 +0200</lastBuildDate><atom:link href="https://blog.zwindler.fr/en/tags/kubernetes/index.xml" rel="self" type="application/rss+xml"/><item><title>zeropod v0.12.0: One Year Later, Does Scale-to-Zero Deliver?</title><link>https://blog.zwindler.fr/en/2026/05/30/zeropod-v0.12.0-one-year-later-does-scale-to-zero-deliver/</link><pubDate>Sat, 30 May 2026 12:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/05/30/zeropod-v0.12.0-one-year-later-does-scale-to-zero-deliver/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/05/zeropod.webp" alt="Featured image of post zeropod v0.12.0: One Year Later, Does Scale-to-Zero Deliver?" /&gt;&lt;h2 id="what-is-zeropod"&gt;What is zeropod?
&lt;/h2&gt;&lt;p&gt;A little less than a year ago, I &lt;a class="link" href="https://blog.zwindler.fr/en/2025/06/20/zeropod-scale-to-zero-with-container-checkpointing/" &gt;published a first article&lt;/a&gt;, reviewing a tool called zeropod.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Zeropod is a Kubernetes runtime (more specifically a containerd shim) that automatically checkpoints containers to disk after a certain amount of time of the last TCP connection.&lt;/p&gt;
&lt;p&gt;While in scaled down state, it will listen on the same port the application inside the container was listening on and will restore the container on the first incoming connection.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;At the time, the stable version was v0.6.x, and I tested it on a k3s cluster. The results were mixed: it mostly worked, but with deal-breaking limitations for serious use (unusable probes, flaky behavior under load, checkpointing times a bit high for my taste).&lt;/p&gt;
&lt;p&gt;Since then, the project has evolved quite a bit (now v0.12.0), with promises of fixes and improvements. So I decided to give it another shot to see where things stand.&lt;/p&gt;
&lt;p&gt;One other change: I abandoned k3s for this test series, for reasons I&amp;rsquo;ll detail throughout the article.&lt;/p&gt;
&lt;h2 id="prerequisites"&gt;Prerequisites
&lt;/h2&gt;&lt;p&gt;This time, I used a freshly provisioned server at my favorite hosting provider:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;An Ubuntu 24.04.3 LTS (Noble) server, kernel 6.17.0-35 HWE, 7.7 Gi RAM, 100G disk&lt;/li&gt;
&lt;li&gt;A Kubernetes cluster set up with &lt;strong&gt;kubeadm&lt;/strong&gt;, flannel as CNI&lt;/li&gt;
&lt;li&gt;Vanilla containerd&lt;/li&gt;
&lt;li&gt;local-path-provisioner (Rancher) for default local storage&lt;/li&gt;
&lt;li&gt;No cert-manager or Ingress, keeping it simple this time&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="installation"&gt;Installation
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;ve installed kubeadm many times and you probably have too, so I won&amp;rsquo;t insult you with yet another tutorial.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Install kubeadm, kubelet, kubectl + containerd&lt;/span&gt;
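&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# flannel&amp;#39;s stock manifest assumes 10.244.0.0/16; with another CIDR, edit its net-conf.json to match&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;sudo kubeadm init --pod-network-cidr&lt;span class="o"&gt;=&lt;/span&gt;10.42.0.0/16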
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl taint nodes --all node-role.kubernetes.io/control-plane-
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Once the cluster is running, installing zeropod is trivial:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Apply the generic kustomize&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl apply -k https://github.com/ctrox/zeropod/config/generic
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Add a label to our single node&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl label node &amp;lt;your-node&amp;gt; zeropod.ctrox.dev/node&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Verify the pod is running&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl -n zeropod-system &lt;span class="nb"&gt;wait&lt;/span&gt; --for&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Ready pod -l app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;zeropod-node
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;That&amp;rsquo;s it. No special flags, no extra configuration. On kubeadm, kubelet is a native binary (&lt;code&gt;/usr/bin/kubelet&lt;/code&gt;) detected automatically by zeropod.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Note about k3s&lt;/strong&gt;: on k3s, the zeropod documentation points to a different config, &lt;code&gt;config/k3s&lt;/code&gt;. Its kustomize adds a &lt;code&gt;-probe-binary-name=k3s&lt;/code&gt; flag on the initContainer so the shim knows the kubelet is embedded in the k3s binary. During my tests, even with this flag, the behavior wasn&amp;rsquo;t as expected (the socket tracker didn&amp;rsquo;t filter probes correctly). With the default config, the flag is set on the initContainer but not on the manager, so I suspect another component needs patching, but I didn&amp;rsquo;t investigate further.&lt;/p&gt;
&lt;p&gt;The DaemonSet deploys the following images:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;ghcr.io/ctrox/zeropod-manager:v0.12.0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ghcr.io/ctrox/zeropod-installer:v0.12.0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;ghcr.io/ctrox/zeropod-criu:v4.2&lt;/code&gt; (CRIU has been updated since v0.6.x)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Let&amp;rsquo;s verify the zeropod runtimeClass is available:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get runtimeclass
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;NAME      HANDLER   AGE
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;zeropod   zeropod   30m
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="first-test-nginx"&gt;First test: nginx
&lt;/h2&gt;&lt;p&gt;Like last time, let&amp;rsquo;s start with a simple nginx test. Apply a minimal Deployment:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;apps/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Deployment&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;matchLabels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;annotations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;zeropod.ctrox.dev/scaledown-duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;10s&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;runtimeClassName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;zeropod&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;- &lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;- &lt;span class="nt"&gt;containerPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The important parts are the annotation &lt;code&gt;zeropod.ctrox.dev/scaledown-duration: 10s&lt;/code&gt; and &lt;code&gt;runtimeClassName: zeropod&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Shortly after deployment, zeropod detects the absence of traffic and checkpoints the container. The manager logs show the sequence:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;time&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;2026-05-29T14:26:37.269453861Z&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;level&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;INFO&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;msg&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;status event&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;container&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;nginx&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;pod&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;nginx-bench-7c65c8874-6pts7&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;phase&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;RUNNING&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;duration&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;0s&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then 10 seconds later:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;time&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;2026-05-29T14:26:47.465510937Z&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;level&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;INFO&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;msg&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;status event&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;container&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;nginx&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;pod&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;nginx-bench-7c65c8874-6pts7&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;phase&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;SCALED_DOWN&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;duration&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;191.259383ms&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The &lt;code&gt;duration&lt;/code&gt; field in the SCALED_DOWN log is the checkpoint time: &lt;strong&gt;191ms&lt;/strong&gt;. That&amp;rsquo;s already significantly better than the ~400ms from v0.6.x on k3s (likely thanks to an updated dependency, perhaps CRIU 4.2).&lt;/p&gt;
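&lt;p&gt;To pull that number out of a longer log, a small shell filter is enough. This is only a sketch: &lt;code&gt;checkpoint_duration&lt;/code&gt; is a hypothetical helper name, and it assumes each status event is a single-line JSON object shaped like the one quoted above.&lt;/p&gt;

```shell
# Sample line copied from the manager logs quoted in this article.
log_line='{"time":"2026-05-29T14:26:47.465510937Z","level":"INFO","msg":"status event","container":"nginx","pod":"nginx-bench-7c65c8874-6pts7","phase":"SCALED_DOWN","duration":"191.259383ms"}'

# checkpoint_duration (hypothetical helper): read log lines on stdin and
# print the "duration" value of every SCALED_DOWN status event.
# Assumption: one JSON object per line, with string-valued fields.
checkpoint_duration() {
  grep '"phase":"SCALED_DOWN"' | sed -n 's/.*"duration":"\([^"]*\)".*/\1/p'
}

printf '%s\n' "$log_line" | checkpoint_duration
```

&lt;p&gt;On a live cluster, the same function could be fed from &lt;code&gt;kubectl -n zeropod-system logs -l app.kubernetes.io/name=zeropod-node&lt;/code&gt; (I&amp;rsquo;m assuming the manager writes to the pod&amp;rsquo;s default container; otherwise a &lt;code&gt;-c&lt;/code&gt; flag would be needed).&lt;/p&gt;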
&lt;h2 id="restore"&gt;Restore
&lt;/h2&gt;&lt;p&gt;When we send an HTTP request, the container wakes up:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;time&lt;/span&gt; curl http://&amp;lt;POD_IP&amp;gt;/ -s -o /dev/null -w &lt;span class="s2"&gt;&amp;#34;HTTP %{http_code} (%{time_total}s)\n&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;HTTP &lt;span class="m"&gt;200&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;0.101s&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;real 0m0.101s
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;user 0m0.003s
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;sys 0m0.003s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;101ms&lt;/strong&gt; to restore nginx and serve a page. That&amp;rsquo;s comparable to the ~92ms from the original article.&lt;/p&gt;
&lt;p&gt;A nice improvement: under v0.6.x, &lt;code&gt;kubectl top pods&lt;/code&gt; returned an error for checkpointed pods:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# v0.6.x&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl top pods
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;error: Metrics not available &lt;span class="k"&gt;for&lt;/span&gt; pod default/php-xxx
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This bug was fixed in v0.9.0. Now pods in SCALED_DOWN state return &lt;code&gt;0m 0Mi&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl top pods
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;NAME    CPU&lt;span class="o"&gt;(&lt;/span&gt;cores&lt;span class="o"&gt;)&lt;/span&gt;   MEMORY&lt;span class="o"&gt;(&lt;/span&gt;bytes&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;nginx   0m           0Mi
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Much cleaner.&lt;/p&gt;
&lt;h2 id="testing-liveness-probes"&gt;Testing liveness probes
&lt;/h2&gt;&lt;p&gt;This was THE major limitation of v0.6.x: zeropod was incompatible with Kubernetes probes. If you added an httpGet liveness probe to your container, the probe would reset the scale-down timer, and your container would never reach the SCALED_DOWN state. The result: probes were unusable, which made zeropod a non-starter in production.&lt;/p&gt;
&lt;h3 id="what-changed"&gt;What changed
&lt;/h3&gt;&lt;p&gt;Two fixes were made between v0.6.x and v0.12.0:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The activator&lt;/strong&gt; (the eBPF component that listens for traffic during SCALED_DOWN): it intercepts probes and replies 200 directly, without waking the container. That&amp;rsquo;s the &amp;ldquo;post scale-down&amp;rdquo; behavior.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The socket tracker&lt;/strong&gt; (the component that tracks connections during RUNNING state): since PR #72 (merged August 2025), zeropod can detect connections coming from kubelet and not count them as &amp;ldquo;real&amp;rdquo; traffic. That&amp;rsquo;s the &amp;ldquo;pre scale-down&amp;rdquo; behavior.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="real-world-test"&gt;Real-world test
&lt;/h3&gt;&lt;p&gt;I deployed nginx with an aggressive liveness probe (periodSeconds: 5) and &lt;code&gt;scaledown-duration: 10s&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;runtimeClassName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;zeropod&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;- &lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;nginx&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;livenessProbe&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;httpGet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;/&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;periodSeconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On k3s, this test gave me trouble: the socket tracker couldn&amp;rsquo;t filter probe connections, which kept resetting the scale-down timer indefinitely. The activator (which filters probes once scale-down has happened) worked fine though.&lt;/p&gt;
&lt;p&gt;On kubeadm, the result is immediate:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get pod -l &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nginx-probe -o json &lt;span class="p"&gt;|&lt;/span&gt; jq -r &lt;span class="s1"&gt;&amp;#39;.items[0].metadata.labels[&amp;#34;status.zeropod.ctrox.dev/nginx&amp;#34;]&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;SCALED_DOWN
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The socket tracker correctly filters connections from the native kubelet. The &lt;code&gt;periodSeconds&lt;/code&gt; &amp;gt; &lt;code&gt;scaledown-duration&lt;/code&gt; rule I had to use on k3s is no longer necessary.&lt;/p&gt;
&lt;h2 id="a-word-on-changes-between-v06x-and-v0120"&gt;A word on changes between v0.6.x and v0.12.0
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s a summary of improvements between the two versions:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;v0.6.x&lt;/th&gt;
&lt;th&gt;v0.12.0&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kubectl top pods&lt;/code&gt; while scaled-down&lt;/td&gt;
&lt;td&gt;❌ Error&lt;/td&gt;
&lt;td&gt;✅ &lt;code&gt;0m 0Mi&lt;/code&gt; (fixed in v0.9.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Checkpoint (nginx)&lt;/td&gt;
&lt;td&gt;~400ms&lt;/td&gt;
&lt;td&gt;~185ms (-54%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Restore (nginx)&lt;/td&gt;
&lt;td&gt;~92ms&lt;/td&gt;
&lt;td&gt;~99ms (stable)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CRIU&lt;/td&gt;
&lt;td&gt;v3.x&lt;/td&gt;
&lt;td&gt;v4.2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Checkpoint failure handling&lt;/td&gt;
&lt;td&gt;basic&lt;/td&gt;
&lt;td&gt;metrics + events (v0.11.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Configurable proxy timeouts&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (v0.11.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inter-node migration&lt;/td&gt;
&lt;td&gt;basic&lt;/td&gt;
&lt;td&gt;improved + timeouts (v0.10.0)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Probes&lt;/td&gt;
&lt;td&gt;❌ Incompatible&lt;/td&gt;
&lt;td&gt;✅ Activator + socket tracker&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
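&lt;p&gt;The percentage in the checkpoint row is plain arithmetic on the two measurements, and it can be rechecked from a shell. A minimal sketch, where &lt;code&gt;pct_change&lt;/code&gt; is a hypothetical helper and the inputs are the rounded values from the table:&lt;/p&gt;

```shell
# pct_change OLD NEW (hypothetical helper): print the relative change
# from OLD to NEW as a signed percentage, rounded to a whole number.
pct_change() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.0f%%\n", (new - old) / old * 100 }'
}

pct_change 400 185   # checkpoint time in ms, v0.6.x then v0.12.0
pct_change 92 99     # restore time in ms, v0.6.x then v0.12.0
```

&lt;p&gt;This prints &lt;code&gt;-54%&lt;/code&gt; for the checkpoint and &lt;code&gt;+8%&lt;/code&gt; for the restore, which amounts to a handful of milliseconds on a ~100ms operation.&lt;/p&gt;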
&lt;p&gt;Interesting technical note about CRIU flags: zeropod removed the &lt;code&gt;--tcp-established&lt;/code&gt; option in September 2025. Previously, active TCP connections were saved and restored with the container. Now, zeropod uses &lt;code&gt;--tcp-skip-in-flight&lt;/code&gt; (when the installed runc is &amp;gt;= 1.3 and supports it). Practical consequence: if your container has outgoing TCP connections at checkpoint time, they will be lost, and you&amp;rsquo;ll need to re-establish them after restore.&lt;/p&gt;
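&lt;p&gt;For scripts or entrypoints that hold such an outgoing connection, the usual answer is to retry rather than fail on the first error. Here is a minimal shell sketch of that idea; &lt;code&gt;retry&lt;/code&gt; and &lt;code&gt;RETRY_DELAY&lt;/code&gt; are hypothetical names of mine, not something zeropod provides.&lt;/p&gt;

```shell
# Seconds to wait between attempts (hypothetical knob, defaults to 1).
RETRY_DELAY=${RETRY_DELAY:-1}

# retry MAX CMD...: run CMD until it succeeds, at most MAX times.
# Returns 0 on the first success, 1 once MAX attempts have all failed.
retry() {
  max=$1
  shift
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$RETRY_DELAY"
  done
  return 0
}

# Example: the command to retry is whatever reopens the connection.
retry 3 true
```

&lt;p&gt;In practice the retried command would be the reconnection itself, for instance &lt;code&gt;retry 5 nc -z mysql 3306&lt;/code&gt;, assuming a database reachable under the name &lt;code&gt;mysql&lt;/code&gt; and a netcat build that supports &lt;code&gt;-z&lt;/code&gt;.&lt;/p&gt;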
&lt;h2 id="deploying-wordpress-the-realistic-use-case"&gt;Deploying WordPress (the &amp;ldquo;realistic&amp;rdquo; use case)
&lt;/h2&gt;&lt;p&gt;Alright, nginx is fine but not very representative. Let&amp;rsquo;s deploy a real app with PHP, Apache, and a database.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s reuse the WordPress manifest from the original article, with zeropod enabled for WordPress but not for MySQL:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;apps/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Deployment&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;matchLabels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;annotations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;zeropod.ctrox.dev/scaledown-duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;10s&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;zeropod.ctrox.dev/container-names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;zeropod.ctrox.dev/ports-map&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress=80&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;runtimeClassName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;zeropod&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;initContainers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wait-for-mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql:8&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;- &lt;span class="l"&gt;sh&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;- -&lt;span class="l"&gt;c&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;        &lt;/span&gt;- &lt;span class="p"&gt;|&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; until mysql -h mysql -u root -p${MYSQL_ROOT_PASSWORD} -e &amp;#34;SELECT 1&amp;#34;; do
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; echo &amp;#34;Waiting for MySQL...&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; sleep 3
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; done
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; mysql -h mysql -u root -p${MYSQL_ROOT_PASSWORD} -e &amp;#34;CREATE DATABASE IF NOT EXISTS wordpress;&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;MYSQL_ROOT_PASSWORD&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;verySecurePassword&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress:latest&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;containerPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;80&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;WORDPRESS_DB_HOST&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;WORDPRESS_DB_USER&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;root&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;WORDPRESS_DB_PASSWORD&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;verySecurePassword&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;WORDPRESS_DB_NAME&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;wordpress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;MySQL (without zeropod):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Service&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3306&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;clusterIP&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;None&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nn"&gt;---&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;apps/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;StatefulSet&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;matchLabels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;template&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;labels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;app&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql:8&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;containerPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;3306&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;MYSQL_ROOT_PASSWORD&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;verySecurePassword&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;volumeMounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;mountPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;/var/lib/mysql&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;data&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;volumeClaimTemplates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;data&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;accessModes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;ReadWriteOnce&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;storage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;5Gi&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;WordPress (Apache + PHP) checkpoint:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;time&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;2026-05-29T14:33:35.915988635Z&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;phase&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;SCALED_DOWN&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;duration&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;313.286787ms&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Checkpoints take around 300ms. Restore is around 200ms:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;time&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;2026-05-29T14:34:31.56806589Z&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;phase&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;RUNNING&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nt"&gt;&amp;#34;duration&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;206.33005ms&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The first curl request confirms it:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nb"&gt;time&lt;/span&gt; curl http://&amp;lt;POD_IP&amp;gt;/ -s -o /dev/null -w &lt;span class="s2"&gt;&amp;#34;%{http_code} (%{time_total}s)&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;HTTP &lt;span class="m"&gt;302&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;0.212s&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;212ms, roughly &lt;strong&gt;twice as fast&lt;/strong&gt; as the 454ms from the original article.&lt;/p&gt;
&lt;h2 id="the-killer-test-cascading-wordpress--mysql"&gt;The killer test: cascading WordPress + MySQL
&lt;/h2&gt;&lt;p&gt;In the original article, I attempted the ultimate test: running &lt;strong&gt;both&lt;/strong&gt; pods (WordPress AND MySQL) with &lt;code&gt;runtimeClassName: zeropod&lt;/code&gt;, then sending an HTTP request to WordPress while both are checkpointed.&lt;/p&gt;
&lt;p&gt;The scenario goes like this:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;After a few seconds, WordPress goes SCALED_DOWN, MySQL goes SCALED_DOWN too&lt;/li&gt;
&lt;li&gt;A client sends an HTTP request to WordPress&lt;/li&gt;
&lt;li&gt;The WordPress activator intercepts the request and restores the WordPress container&lt;/li&gt;
&lt;li&gt;WordPress (Apache+PHP) starts up, executes PHP code requiring a MySQL connection&lt;/li&gt;
&lt;li&gt;WordPress connects to MySQL on port 3306&lt;/li&gt;
&lt;li&gt;The MySQL activator intercepts the connection and restores MySQL&lt;/li&gt;
&lt;li&gt;MySQL responds, WordPress generates the page, Apache sends back the HTTP response&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That would be true scale-to-zero: the entire app scales down, not just the frontend or the backend.&lt;/p&gt;
&lt;p&gt;Important note: I know that you&amp;rsquo;d probably &lt;strong&gt;never&lt;/strong&gt; want to scale down your database in production, but testing this shows that this approach (eBPF + CRIU) works beyond just scaling webservers to zero, which other tools on the market already do very well.&lt;/p&gt;
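&lt;p&gt;For illustration, enabling zeropod on the MySQL side presumably mirrors the WordPress manifest. The fragment below is a sketch under that assumption, not the manifest used in the test: it reuses the annotation keys shown earlier for WordPress and swaps in the container name and port of the MySQL StatefulSet.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# Hypothetical fragment of the MySQL StatefulSet pod template.
# Annotation values are assumptions modeled on the WordPress Deployment.
template:
  metadata:
    annotations:
      zeropod.ctrox.dev/scaledown-duration: 10s
      zeropod.ctrox.dev/container-names: mysql
      zeropod.ctrox.dev/ports-map: mysql=3306
    labels:
      app: mysql
  spec:
    runtimeClassName: zeropod
    containers:
    - image: mysql:8
      name: mysql
      ports:
      - containerPort: 3306
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The rest of the StatefulSet (env, volume mounts, volume claim template) would stay as in the version without zeropod.&lt;/p&gt;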
&lt;h2 id="on-k3s-no-luck-on-kubeadm-victory"&gt;On k3s: no luck, on kubeadm: victory
&lt;/h2&gt;&lt;p&gt;On k3s, I still haven&amp;rsquo;t managed to make this scenario work: WordPress restores but doesn&amp;rsquo;t respond on port 80. I spent quite some time trying to figure out why, without success.&lt;/p&gt;
&lt;p&gt;Same test, same zeropod version, but on kubeadm:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;curl http://&amp;lt;POD_IP&amp;gt;/ -s -o /dev/null -w &lt;span class="s2"&gt;&amp;#34;%{http_code} (%{time_total}s)&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;HTTP &lt;span class="m"&gt;302&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;0.224s&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;224ms.&lt;/strong&gt; Both pods went from SCALED_DOWN to RUNNING, the WordPress page loads. I repeated the test 5 times in a row:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cycle&lt;/th&gt;
&lt;th&gt;Time (curl)&lt;/th&gt;
&lt;th&gt;HTTP&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;229ms&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;192ms&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;227ms&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;230ms&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;211ms&lt;/td&gt;
&lt;td&gt;302&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The zeropod logs confirm both containers waking up:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;WordPress: SCALED_DOWN → RUNNING
MySQL: SCALED_DOWN → RUNNING
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="what-really-changed-since-v06x"&gt;What really changed since v0.6.x
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Probes (finally!)&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This was my main grievance in the first article. Today, it&amp;rsquo;s resolved:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Before: impossible to use Kubernetes probes → zeropod unusable in any somewhat serious use case&lt;/li&gt;
&lt;li&gt;After: the activator intercepts probes in SCALED_DOWN (replies 200 without restore), and the socket tracker filters kubelet connections in RUNNING state&lt;/li&gt;
&lt;/ul&gt;
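&lt;p&gt;As a sketch of what this unlocks, regular probes can now be declared on a zeropod-managed container. The fragment below only uses standard Kubernetes probe fields; the path, port, and timing values are illustrative assumptions, not the settings used in the tests above.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# Hypothetical probe settings for the wordpress container (values are assumptions)
containers:
- image: wordpress:latest
  name: wordpress
  ports:
  - containerPort: 80
  readinessProbe:
    httpGet:
      path: /
      port: 80
    periodSeconds: 10
  livenessProbe:
    tcpSocket:
      port: 80
    periodSeconds: 10
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Kubernetes HTTP probes treat any status code from 200 to 399 as a success, so the 302 that WordPress returns on &lt;code&gt;/&lt;/code&gt; would still count as healthy.&lt;/p&gt;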
&lt;p&gt;&lt;strong&gt;Stability&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The flaky behavior (pod loss under simultaneous load) is gone. Where I had failures in the original article, sequential and simultaneous load tests all pass now.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Checkpoint time dropped by roughly half (185ms vs 400ms), and so did restore time (200ms vs 454ms). Probably thanks to CRIU v4.2 and zeropod optimizations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;kubectl top pods&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This small detail was an eyesore: &lt;code&gt;kubectl top pods&lt;/code&gt; crashed on checkpointed pods. Fixed in v0.9.0.&lt;/p&gt;
&lt;h2 id="some-remaining-limitations"&gt;Some remaining limitations
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;d be dishonest if I said everything is perfect. Here&amp;rsquo;s what&amp;rsquo;s still problematic:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Since September 2025, zeropod no longer uses &lt;code&gt;--tcp-established&lt;/code&gt; for CRIU. This means if your container has outgoing TCP connections at checkpoint time, they will be lost. In practice, for a web server, this means reconnecting to the database after restore. With zeropod, the MySQL connection is re-established automatically (the current PHP request fails, the next one succeeds). This is a detail that may matter depending on your use case.&lt;/li&gt;
&lt;li&gt;I only tested WordPress (Apache + mod_php) and MySQL. Applications using websockets, streaming, or long-lived connections might behave differently.&lt;/li&gt;
&lt;li&gt;I had difficulties getting it to work properly on k3s.&lt;/li&gt;
&lt;li&gt;I observed a glitch (Apache segfault) on the first restore of a freshly created WordPress pod. It&amp;rsquo;s not reproducible after a normal checkpoint/restore cycle, but if you frequently redeploy your pods, you might encounter it.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="verdict"&gt;Verdict
&lt;/h2&gt;&lt;p&gt;One year later, zeropod seems to deliver on more of its promises:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Even though it wasn&amp;rsquo;t a dealbreaker, checkpoint/restore performance has improved significantly (roughly half the time), which is always welcome&lt;/li&gt;
&lt;li&gt;Kubernetes probe support has been added, lifting what I considered the main blocker&lt;/li&gt;
&lt;li&gt;General stability is better&lt;/li&gt;
&lt;li&gt;The cascade test (WordPress + MySQL) works&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I&amp;rsquo;m still just as hyped about the idea of freezing a container and restoring it 10 seconds later like nothing happened. The magic of CRIU and eBPF combined is starting to materialize, after years of waiting.&lt;/p&gt;
&lt;p&gt;Would I put this in production? Let&amp;rsquo;s say it&amp;rsquo;s less risky than a year ago. On my personal cluster for fun, why not. For a production database with long-lived connections, I think I&amp;rsquo;d still pass ;-).&lt;/p&gt;
&lt;p&gt;But honestly, the project has evolved well and deserves a closer look.&lt;/p&gt;</description></item><item><title>nodes/proxy GET: One Kubernetes permission too many</title><link>https://blog.zwindler.fr/en/2026/05/19/nodes/proxy-get-one-kubernetes-permission-too-many/</link><pubDate>Tue, 19 May 2026 12:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/05/19/nodes/proxy-get-one-kubernetes-permission-too-many/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/05/node_proxy_RCE.webp" alt="Featured image of post nodes/proxy GET: One Kubernetes permission too many" /&gt;&lt;h2 id="tldr"&gt;TL;DR
&lt;/h2&gt;&lt;p&gt;This is a bit old now, but I still wanted to share a quick write-up on the topic.&lt;/p&gt;
&lt;p&gt;Back in January, a cybersecurity researcher reported a Kubernetes flaw that generated quite a buzz. It had been a while since we had a Kube vulnerability that got people talking this much, at least as far as Denis can remember (yes, I talk about myself in the third person).&lt;/p&gt;
&lt;p&gt;Honestly, it&amp;rsquo;s pretty wild: the &lt;code&gt;nodes/proxy GET&lt;/code&gt; RBAC permission allows any &lt;strong&gt;ServiceAccount&lt;/strong&gt; that holds it to execute code inside &lt;strong&gt;any Pod in the cluster&lt;/strong&gt;, without leaving a single trace in the audit logs. That&amp;rsquo;s unfortunate, especially when you have ServiceAccounts named &lt;code&gt;rook-ceph-system&lt;/code&gt; that also happen to have read access to all Secrets in the cluster.&lt;/p&gt;
&lt;p&gt;This article details the issue, how to check if you are vulnerable, the fixes to apply, and the preventive measures you can put in place if you can&amp;rsquo;t patch right away.&lt;/p&gt;
&lt;h2 id="the-problem-websocket--kubelet--un-audited-exec"&gt;The problem: WebSocket + Kubelet = un-audited exec
&lt;/h2&gt;&lt;p&gt;The vulnerability was documented by Graham Helton in &lt;a class="link" href="https://grahamhelton.com/blog/nodes-proxy-rce" target="_blank" rel="noopener"
&gt;this article&lt;/a&gt;. Here is how it works.&lt;/p&gt;
&lt;p&gt;The Kubernetes API exposes a &lt;code&gt;nodes/proxy&lt;/code&gt; subresource that proxies HTTP requests to each node&amp;rsquo;s Kubelet. The Kubelet itself exposes an API on port 10250, specifically the &lt;code&gt;/exec&lt;/code&gt; endpoint which allows executing commands inside a container.&lt;/p&gt;
&lt;p&gt;The issue comes from how the Kubelet handles authorizations for WebSocket connections:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;code&gt;kubectl exec&lt;/code&gt; uses a WebSocket connection, whose handshake is an HTTP &lt;code&gt;GET&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;The Kubelet maps this initial &lt;code&gt;GET&lt;/code&gt; to the &lt;code&gt;get&lt;/code&gt; RBAC verb&lt;/li&gt;
&lt;li&gt;It checks &lt;code&gt;nodes/proxy GET&lt;/code&gt;, then authorizes the operation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No secondary check&lt;/strong&gt; is performed for the &lt;code&gt;CREATE&lt;/code&gt; verb normally required for &lt;code&gt;/exec&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Result: any ServiceAccount with &lt;code&gt;nodes/proxy GET&lt;/code&gt; can execute commands in any Pod in the cluster, &lt;strong&gt;including system Pods&lt;/strong&gt; (&lt;code&gt;etcd&lt;/code&gt;, &lt;code&gt;kube-apiserver&lt;/code&gt;, etc.).&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Exploitation via websocat&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;websocat --insecure &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --header &lt;span class="s2"&gt;&amp;#34;Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$TOKEN&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --protocol &lt;span class="s2"&gt;&amp;#34;v4.channel.k8s.io&amp;#34;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="s2"&gt;&amp;#34;wss://&lt;/span&gt;&lt;span class="nv"&gt;$NODE_IP&lt;/span&gt;&lt;span class="s2"&gt;:10250/exec/default/nginx/nginx?output=1&amp;amp;error=1&amp;amp;command=id&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And that&amp;rsquo;s not all. &lt;strong&gt;Commands executed via this method do not generate any Kubernetes audit logs&lt;/strong&gt; (well, assuming you even collect them 🙈). The access goes directly through the Kubelet, which does not report events back to the API server.&lt;/p&gt;
&lt;p&gt;The official Kubernetes status on this: &lt;strong&gt;Won&amp;rsquo;t Fix&lt;/strong&gt;. The behavior is considered &amp;ldquo;by design&amp;rdquo; (note the quotes), and is addressed through a feature gate (&lt;a class="link" href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2862-fine-grained-kubelet-authz/README.md" target="_blank" rel="noopener"
&gt;KEP-2862&lt;/a&gt;, see below).&lt;/p&gt;
&lt;p&gt;Ouch.&lt;/p&gt;
&lt;h2 id="the-audit-vulnerable-serviceaccounts-on-our-clusters"&gt;The Audit: Vulnerable ServiceAccounts on our clusters
&lt;/h2&gt;&lt;p&gt;In January 2026, following the publication of Graham Helton&amp;rsquo;s article, a lot of people had to urgently audit their clusters. You can either manually audit all your Roles / ClusterRoles, or use a &lt;a class="link" href="https://gist.github.com/grahamhelton/f5c8ce265161990b0847ac05a74e466a" target="_blank" rel="noopener"
&gt;detection script&lt;/a&gt; provided by the researcher.&lt;/p&gt;
&lt;p&gt;As an example, here are three relatively common components that make great candidates for a juicy privilege escalation:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;ClusterRole&lt;/th&gt;
&lt;th&gt;ServiceAccounts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenTelemetry Collector&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;otel-otelcol-k8sobjects&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;opentelemetry-collector-daemonset-collector&lt;/code&gt;, &lt;code&gt;opentelemetry-collector-deployment-collector&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenTelemetry Operator&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;otel-operator-resources&lt;/code&gt; / &lt;code&gt;opentelemetry-operator-manager&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;opentelemetry-operator&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rook-Ceph&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rook-ceph-global&lt;/code&gt;, &lt;code&gt;rook-ceph-mgr-cluster&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rook-ceph-system&lt;/code&gt;, &lt;code&gt;rook-ceph-mgr&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Note: there are many more. Graham Helton added an &amp;ldquo;Appendix: Affected Helm Charts&amp;rdquo; section at the end of his article, listing AT LEAST 69 affected Helm charts by his count.&lt;/p&gt;
&lt;h3 id="the-critical-case-rook-ceph-system"&gt;The critical case: rook-ceph-system
&lt;/h3&gt;&lt;p&gt;In the official chart, the &lt;code&gt;rook-ceph-system&lt;/code&gt; ServiceAccount combined two particularly dangerous permissions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;nodes/proxy GET&lt;/code&gt;&lt;/strong&gt; - the RCE&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;secrets GET/LIST/WATCH&lt;/code&gt;&lt;/strong&gt; across the &lt;strong&gt;entire cluster&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Accessible secrets can include LUKS keys for volume encryption, Ceph admin keyrings, dashboard passwords&amp;hellip; this kind of access makes it a prime target for an attacker.&lt;/p&gt;
&lt;p&gt;The attack scenario: a compromise of the &lt;code&gt;rook-ceph-operator&lt;/code&gt; Pod (via CVE, supply chain, or a malicious image) would allow reading all Ceph secrets, and then executing code in any Pod (including &lt;code&gt;etcd&lt;/code&gt;), leading to a full compromise of the cluster and of its encrypted data.&lt;/p&gt;
&lt;p&gt;To manually check if a ServiceAccount is vulnerable:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl auth can-i get nodes --subresource&lt;span class="o"&gt;=&lt;/span&gt;proxy &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --as&lt;span class="o"&gt;=&lt;/span&gt;system:serviceaccount:&amp;lt;namespace&amp;gt;:&amp;lt;serviceaccount&amp;gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="examples-of-fixes-to-apply"&gt;Examples of fixes to apply
&lt;/h2&gt;&lt;h3 id="rook-ceph-upstream-fix"&gt;Rook-Ceph: Upstream fix
&lt;/h3&gt;&lt;p&gt;For Rook-Ceph, the fix came from upstream: PR &lt;a class="link" href="https://github.com/rook/rook/pull/16979" target="_blank" rel="noopener"
&gt;rook/rook#16979&lt;/a&gt; removed &lt;code&gt;nodes/proxy&lt;/code&gt; from ClusterRoles. This fix is included in Rook v1.19.1, so updating the affected clusters was enough.&lt;/p&gt;
&lt;p&gt;After updating, check each cluster:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl get clusterroles rook-ceph-global -o yaml &lt;span class="p"&gt;|&lt;/span&gt; grep -A3 nodes/proxy
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# -&amp;gt; nothing&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="otel--otel-operator"&gt;OTel / OTel operator
&lt;/h3&gt;&lt;p&gt;For OpenTelemetry, the situation is potentially more complex. If you are using &lt;code&gt;otel-operator&lt;/code&gt; and &lt;strong&gt;OtelCollector&lt;/strong&gt; Custom Resources, you likely have to manage the RBAC manifests &lt;strong&gt;yourself&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Having had to do it myself, I can confirm it&amp;rsquo;s quite painful. Depending on your collector type and the receivers you enabled, you need to cross-reference multiple documents on the official OTel and otel-operator websites.&lt;/p&gt;
&lt;p&gt;Upstream merged a conditional approach in &lt;a class="link" href="https://github.com/open-telemetry/opentelemetry-helm-charts/pull/2083" target="_blank" rel="noopener"
&gt;open-telemetry/opentelemetry-helm-charts#2083&lt;/a&gt; based on the Kubernetes version:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# Upstream approach (opentelemetry-helm-charts#2083)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;{{- &lt;span class="l"&gt;if semverCompare &amp;#34;&amp;gt;=1.33-0&amp;#34; .Capabilities.KubeVersion.Version }}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/pods&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;{{- &lt;span class="l"&gt;else }}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/proxy&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;{{- &lt;span class="l"&gt;end }}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here again, if all your clusters are up to date, you can simply replace &lt;code&gt;nodes/proxy&lt;/code&gt; with &lt;code&gt;nodes/pods&lt;/code&gt; directly (without the condition).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;otel-collector-crb.yaml&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# Before&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;rules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;apiGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/proxy &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# &amp;lt;- RCE risk&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/spec&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/stats&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;verbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;get&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# After&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;rules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;apiGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# nodes/pods replaces nodes/proxy (RCE risk, see https://grahamhelton.com/blog/nodes-proxy-rce)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Requires K8s &amp;gt;= 1.33 (KEP-2862 fine-grained kubelet authz)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/pods&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/spec&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/stats&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;verbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;get&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;&lt;code&gt;otel-operator-rbac.yaml&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# Before&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;apiGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/proxy &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# &amp;lt;- RCE risk&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;verbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;get&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# After&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# nodes/pods replaces nodes/proxy (RCE risk, see https://grahamhelton.com/blog/nodes-proxy-rce)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# Requires K8s &amp;gt;= 1.33 (KEP-2862 fine-grained kubelet authz)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;apiGroups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;nodes/pods&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;verbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;get&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="preventive-measures"&gt;Preventive Measures
&lt;/h2&gt;&lt;h3 id="kep-2862-fine-grained-kubelet-api-authorization"&gt;KEP-2862: Fine-Grained Kubelet API Authorization
&lt;/h3&gt;&lt;p&gt;As teased earlier, the real long-term solution is &lt;a class="link" href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2862-fine-grained-kubelet-authz/README.md" target="_blank" rel="noopener"
&gt;KEP-2862&lt;/a&gt; (Fine-Grained Kubelet API Authorization). It introduces granular subresources (&lt;code&gt;nodes/pods&lt;/code&gt;, &lt;code&gt;nodes/metrics&lt;/code&gt;, &lt;code&gt;nodes/stats&lt;/code&gt;, &lt;code&gt;nodes/log&lt;/code&gt;, etc.) allowing precise access without using &lt;code&gt;nodes/proxy&lt;/code&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;K8s Version&lt;/th&gt;
&lt;th&gt;KEP-2862 Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1.32&lt;/td&gt;
&lt;td&gt;Alpha&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.33&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Beta, enabled by default&lt;/strong&gt; - &lt;code&gt;nodes/proxy GET&lt;/code&gt; no longer grants access to &lt;code&gt;/exec&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1.36&lt;/td&gt;
&lt;td&gt;GA (locked to enabled)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;But this requires going through EVERY chart in use and checking all deployed manifests now &lt;strong&gt;and in the future&lt;/strong&gt;.&lt;/p&gt;
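&lt;p&gt;On 1.32, where the feature is still alpha, it has to be switched on explicitly. Here is a minimal sketch, assuming the feature gate is named &lt;code&gt;KubeletFineGrainedAuthz&lt;/code&gt; (the name used in KEP-2862, to be double-checked against your version) and that your kubelets are configured through a &lt;code&gt;KubeletConfiguration&lt;/code&gt; file:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;# Sketch only: merge into your existing kubelet configuration file
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;apiVersion: kubelet.config.k8s.io/v1beta1
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kind: KubeletConfiguration
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;featureGates:
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  # Assumed gate name, taken from KEP-2862; alpha in 1.32
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;  KubeletFineGrainedAuthz: true
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The same gate may also need to be set on the API server so that the built-in roles pick up the new subresources. I would verify that for your distribution before relying on it.&lt;/p&gt;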
&lt;h3 id="ciliumnetworkpolicy-blocking-the-kubelet-port"&gt;CiliumNetworkPolicy: Blocking the Kubelet port
&lt;/h3&gt;&lt;p&gt;While waiting for the K8s upgrade, or as defense-in-depth, you can block access to port 10250 from the affected pods using NetworkPolicies (or CiliumNetworkPolicy if you use Cilium as your CNI plugin).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Warning&lt;/strong&gt;: This only applies to components that do &lt;em&gt;not&lt;/em&gt; need to access the Kubelet. The OTel collector potentially needs it to gather Kubelet metrics. In that case, you have no choice but to fix the RBAC.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;cilium.io/v2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;CiliumNetworkPolicy&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;deny-kubelet-api-access&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;&amp;lt;namespace&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;endpointSelector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;matchLabels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;&amp;lt;app-label&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;&amp;lt;value&amp;gt;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;egressDeny&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;toEntities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;host&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;remote-node&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;toPorts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;10250&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;TCP&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="kyverno-blocking-the-creation-of-new-roles-with-nodesproxy"&gt;Kyverno: Blocking the creation of new Roles with nodes/proxy
&lt;/h3&gt;&lt;p&gt;To prevent any regression (remember, we need to protect ourselves &lt;strong&gt;in the future&lt;/strong&gt;), we can add a Kyverno &lt;code&gt;ClusterPolicy&lt;/code&gt; that rejects the creation or modification of a &lt;code&gt;ClusterRole&lt;/code&gt; or &lt;code&gt;Role&lt;/code&gt; containing &lt;code&gt;nodes/proxy&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Luckily, there are ready-to-use examples on the official Kyverno website:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://kyverno.io/policies/other-vpol/restrict-clusterrole-nodesproxy/restrict-clusterrole-nodesproxy/" target="_blank" rel="noopener"
&gt;validating variant&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kyverno.io/policies/other-cel/restrict-clusterrole-nodesproxy/restrict-clusterrole-nodesproxy/" target="_blank" rel="noopener"
&gt;CEL variant&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Note: &lt;a class="link" href="https://github.com/kyverno/policies/issues/1492" target="_blank" rel="noopener"
&gt;there is probably a hole in the official Kyverno policy; I opened an issue&lt;/a&gt;&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;kyverno.io/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;ClusterPolicy&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;restrict-nodes-proxy&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;validationFailureAction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Audit &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# switch to Enforce after validation&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;background&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;rules&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;deny-nodes-proxy-in-clusterroles&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;kinds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;ClusterRole&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;Role&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;exclude&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="s2"&gt;&amp;#34;system:kubelet-api-admin&amp;#34;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# built-in K8s, unmodifiable&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; nodes/proxy grants RCE capability via Kubelet WebSocket exec.
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; Use nodes/pods (requires K8s &amp;gt;= 1.33, KEP-2862) instead.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;deny&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;conditions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;any&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;nodes/proxy&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;AnyIn&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;{{ request.object.rules[].resources[] }}&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Deployment is done in two stages: first in &lt;code&gt;Audit&lt;/code&gt; mode to ensure there are no remaining vulnerable manifests (which you should fix &lt;strong&gt;before&lt;/strong&gt; blocking), then in &lt;code&gt;Enforce&lt;/code&gt; mode to actually block them.&lt;/p&gt;
&lt;h3 id="monitoring-the-audit-log"&gt;Monitoring the Audit Log
&lt;/h3&gt;&lt;p&gt;Even if commands executed &lt;em&gt;via&lt;/em&gt; the Kubelet leave no trace, we can monitor &lt;strong&gt;SubjectAccessReviews&lt;/strong&gt; to detect attempts at enumerating &lt;code&gt;nodes/proxy&lt;/code&gt; permissions.&lt;/p&gt;
&lt;p&gt;The configuration in the Kubernetes audit policy:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# audit-policy.yaml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;- &lt;span class="nt"&gt;level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Request&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;verbs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;create&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;    &lt;/span&gt;- &lt;span class="nt"&gt;group&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;authorization.k8s.io&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;subjectaccessreviews&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Then a Prometheus/Alertmanager alert on SARs related to &lt;code&gt;nodes/proxy&lt;/code&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-promql" data-lang="promql"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Count SubjectAccessReview creations (metric labels cannot single out nodes/proxy; check the audit log for that)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kr"&gt;increase&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;apiserver_request_total&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;verb&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;#34;&lt;/span&gt;&lt;span class="s"&gt;POST&lt;/span&gt;&lt;span class="p"&gt;&amp;#34;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;resource&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;#34;&lt;/span&gt;&lt;span class="s"&gt;subjectaccessreviews&lt;/span&gt;&lt;span class="p"&gt;&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}[&lt;/span&gt;&lt;span class="s"&gt;5m&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://grahamhelton.com/blog/nodes-proxy-rce" target="_blank" rel="noopener"
&gt;Kubernetes RCE via nodes/proxy GET&lt;/a&gt; - Graham Helton&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://gist.github.com/grahamhelton/f5c8ce265161990b0847ac05a74e466a" target="_blank" rel="noopener"
&gt;Detection Script&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2862-fine-grained-kubelet-authz/README.md" target="_blank" rel="noopener"
&gt;KEP-2862 Fine-Grained Kubelet API Authorization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kubernetes.io/docs/concepts/security/rbac-good-practices/#access-to-proxy-subresource-of-nodes" target="_blank" rel="noopener"
&gt;Kubernetes RBAC Good Practices - nodes/proxy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/rook/rook/pull/16979" target="_blank" rel="noopener"
&gt;rook/rook#16979&lt;/a&gt; - Upstream fix for Rook-Ceph&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/open-telemetry/opentelemetry-helm-charts/pull/2083" target="_blank" rel="noopener"
&gt;open-telemetry/opentelemetry-helm-charts#2083&lt;/a&gt; - Upstream fix for OTel&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Kubernetes UserNamespaces: the overhyped GA feature</title><link>https://blog.zwindler.fr/en/2026/04/28/kubernetes-usernamespaces-the-overhyped-ga-feature/</link><pubDate>Tue, 28 Apr 2026 10:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/04/28/kubernetes-usernamespaces-the-overhyped-ga-feature/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/04/usernamespaces.webp" alt="Featured image of post Kubernetes UserNamespaces: the overhyped GA feature" /&gt;&lt;h2 id="the-infographic-that-triggered-me"&gt;The infographic that triggered me
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Note: I stumbled upon this GenAI infographic and it was so bad I wrote a post about it. I didn&amp;rsquo;t generate this thing.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Over the past few days, LinkedIn has been flooded with the same kind of infographic. Kubernetes 1.36 is out, and one of the most talked-about features is the GA release of &lt;strong&gt;UserNamespaces&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s a topic I&amp;rsquo;ve been following since 2018 (talk &lt;a class="link" href="https://blog.zwindler.fr/2018/05/03/recap-du-premier-jour-de-kubecon-europe-2018/" target="_blank" rel="noopener"
&gt;The Route to rootless containers&lt;/a&gt; at KubeCon EU 2018), so I&amp;rsquo;m genuinely glad to see this long journey finally reach the finish line. That said, I&amp;rsquo;m appalled by the way it&amp;rsquo;s being marketed on LinkedIn, apparently by people who have no idea how it actually works — and frankly don&amp;rsquo;t care.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&amp;ldquo;Kubernetes just made root safer. Just add &lt;code&gt;hostUsers: false&lt;/code&gt; to your Pod spec.&amp;rdquo;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The visual: an all-powerful king &amp;ldquo;inside the container&amp;rdquo; and a helpless beggar &amp;ldquo;outside on the host&amp;rdquo;. The promise: &amp;ldquo;No Host Access. No Privilege Escalation. No Lateral Movement. No Node Takeover.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;Catchy.&lt;/p&gt;
&lt;p&gt;But presenting it this way is genuinely dangerous, because it obscures entire areas of application and operational security. Selling &lt;code&gt;hostUsers: false&lt;/code&gt; as the universal fix for the &amp;ldquo;root in containers&amp;rdquo; problem is a dramatic oversimplification that will push teams to ignore the real security &lt;strong&gt;priorities&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id="what-usernamespaces-actually-do"&gt;What UserNamespaces actually do
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;The threat model: container escape&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;First, what are we actually talking about? A &lt;strong&gt;container escape&lt;/strong&gt; is when an attacker manages to break out of their container and directly access the host&amp;rsquo;s kernel or filesystem — completely bypassing the normal isolation mechanisms.&lt;/p&gt;
&lt;p&gt;This type of vulnerability is &lt;strong&gt;rare&lt;/strong&gt;, but real-world examples exist:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2019-5736" target="_blank" rel="noopener"
&gt;CVE-2019-5736&lt;/a&gt;&lt;/strong&gt; (runc): write to &lt;code&gt;/proc/self/exe&lt;/code&gt; of the host process from inside the container&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2022-0492" target="_blank" rel="noopener"
&gt;CVE-2022-0492&lt;/a&gt;&lt;/strong&gt; (cgroups v1): escape via &lt;code&gt;unshare&lt;/code&gt; in certain configurations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2024-21626" target="_blank" rel="noopener"
&gt;CVE-2024-21626&lt;/a&gt;&lt;/strong&gt; (runc, &amp;ldquo;Leaky Vessels&amp;rdquo;): file descriptor leak to the host working directory&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If there&amp;rsquo;s a vulnerability of this type on your node AND a process is compromised AND it runs as root in the container AND it doesn&amp;rsquo;t use UserNamespaces, the attacker gets &lt;strong&gt;root on the host&lt;/strong&gt; — &lt;strong&gt;game over&lt;/strong&gt;. Full access to every file on the node, every secret mounted by other pods, ability to install a rootkit or exfiltrate data from all tenants running on that node.&lt;/p&gt;
&lt;p&gt;It remains possible, but that&amp;rsquo;s a lot of &amp;ldquo;ifs&amp;rdquo;. Either way, this is exactly the scenario UserNamespaces address. They introduce &lt;strong&gt;UID mapping&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;UID 0 inside the container is mapped to an unprivileged UID on the host (e.g. 100000, unique per pod)&lt;/li&gt;
&lt;li&gt;If an attacker successfully escapes the container via a kernel exploit, they land on the node as that unprivileged UID (effectively &lt;strong&gt;&lt;code&gt;nobody&lt;/code&gt;&lt;/strong&gt;): the escape succeeds, but the post-escape impact is dramatically reduced&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the &amp;ldquo;Breakouts Lose Impact&amp;rdquo; scenario from the infographic, and on that point, &lt;strong&gt;the infographic is right&lt;/strong&gt;. That&amp;rsquo;s the real value of the feature.&lt;/p&gt;
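&lt;p&gt;To see the mapping for yourself, here is a minimal sketch of a throwaway pod that prints its own UID map. The pod name, the container name and the &lt;code&gt;busybox&lt;/code&gt; image are illustrative choices of mine, not something taken from the infographic.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: v1
kind: Pod
metadata:
  name: userns-demo          # hypothetical name
spec:
  hostUsers: false           # run the pod in its own user namespace
  restartPolicy: Never
  containers:
    - name: show-uid-map     # hypothetical name
      image: busybox:stable  # any image that ships cat would do
      command:
        - cat
        - /proc/self/uid_map # columns: UID inside, UID on host, range length
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Read the result with &lt;code&gt;kubectl logs userns-demo&lt;/code&gt;. Without a user namespace, the file holds the identity mapping &lt;code&gt;0 0 4294967295&lt;/code&gt;; with &lt;code&gt;hostUsers: false&lt;/code&gt;, the second column should be a non-zero host UID.&lt;/p&gt;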
&lt;p&gt;&lt;strong&gt;Edge case: multi-tenancy even with non-root containers&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Even without root containers, UserNamespaces provide something extra in a &lt;strong&gt;truly multi-tenant&lt;/strong&gt; context (multiple customers on the same cluster). Without UserNamespaces, if two pods from different customers both run with &lt;code&gt;runAsUser: 1000&lt;/code&gt;, they share the same UID 1000 on the node. If one escapes, the attacker can access the other pod&amp;rsquo;s files, since they have the same owner. By assigning a unique UID offset per pod, UserNamespaces isolate UIDs between pods even when they use the same value inside the container.&lt;/p&gt;
&lt;p&gt;For internal clusters where you control all workloads, this scenario is theoretical. For a multi-tenant SaaS platform or a public build service, it&amp;rsquo;s a real line of defense.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Technical requirements&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There are a few prerequisites, but most up-to-date clusters should qualify.&lt;/p&gt;
&lt;ul&gt;
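&lt;li&gt;Linux kernel ≥ 5.19 (in practice ≥ 6.3, the first release where tmpfs supports idmapped mounts, which Secrets and ServiceAccount tokens rely on)&lt;/li&gt;
&lt;li&gt;Compatible runtime (containerd ≥ 2.0, CRI-O ≥ 1.25)&lt;/li&gt;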
&lt;li&gt;&lt;em&gt;Idmapped mounts&lt;/em&gt; support for persistent volumes (XFS, ext4 — not NFS in all cases)&lt;/li&gt;
&lt;li&gt;Kubernetes ≥ 1.33 (Beta), ≥ 1.36 (GA)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="what-the-infographic-exaggerates-and-leaves-out"&gt;What the infographic exaggerates (and leaves out)
&lt;/h2&gt;&lt;p&gt;The infographic is right about one specific thing: UserNamespaces reduce the impact of a successful container escape. That&amp;rsquo;s real. The problem is that it sells the feature as a universal solution to &amp;ldquo;root in containers&amp;rdquo;, and that&amp;rsquo;s just wrong.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;1. UID isolation is not application privilege isolation&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The infographic promises &amp;ldquo;No Lateral Movement&amp;rdquo;. That&amp;rsquo;s false — completely false.&lt;/p&gt;
&lt;p&gt;A root container with &lt;code&gt;hostUsers: false&lt;/code&gt; can still read the &lt;strong&gt;ServiceAccount Token&lt;/strong&gt; mounted at &lt;code&gt;/var/run/secrets/kubernetes.io/serviceaccount/token&lt;/code&gt;. If that token has broad RBAC permissions (which happens — we may cover this in a future post), the attacker can call the API Server, enumerate cluster resources, and move laterally — all without ever touching the host node.&lt;/p&gt;
&lt;p&gt;UID mapping protects the host. It does not protect the cluster.&lt;/p&gt;
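&lt;p&gt;If a workload never talks to the API Server, one way to close this path is simply not to mount the token. A minimal sketch, with placeholder pod and image names:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: v1
kind: Pod
metadata:
  name: no-token-demo                     # hypothetical name
spec:
  hostUsers: false
  automountServiceAccountToken: false     # nothing mounted under /var/run/secrets/kubernetes.io/serviceaccount
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;For workloads that do need API access, the token stays, and the real work is trimming the RBAC permissions bound to the ServiceAccount.&lt;/p&gt;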
&lt;p&gt;&lt;strong&gt;2. A root container is still root inside the container&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&amp;ldquo;Install Anything ✅&amp;rdquo; — it&amp;rsquo;s literally written in the infographic, presented as a feature 😖.&lt;/p&gt;
&lt;p&gt;In a root container (even with UserNS), an attacker who gains control can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Install &lt;code&gt;nmap&lt;/code&gt;, &lt;code&gt;curl&lt;/code&gt;, &lt;code&gt;nc&lt;/code&gt; to scan the internal network&lt;/li&gt;
&lt;li&gt;Modify application files, binaries, configurations&lt;/li&gt;
&lt;li&gt;Read all files mounted as volumes&lt;/li&gt;
&lt;li&gt;Persist in the container across restarts if the filesystem is writable&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;UserNamespaces remove none of these attack vectors. The ability to install software is a fast track to lateral movement.&lt;/p&gt;
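&lt;p&gt;For contrast, most of these vectors are addressed by plain &lt;code&gt;securityContext&lt;/code&gt; settings, with or without a user namespace. A sketch, assuming the image can run as an arbitrary non-root UID and only needs to write to &lt;code&gt;/tmp&lt;/code&gt; (names, image and UID are placeholders):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: v1
kind: Pod
metadata:
  name: hardened-demo                     # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000                       # assumption: the image works with this UID
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0 # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true      # image files and binaries cannot be modified
        capabilities:
          drop:
            - ALL
      volumeMounts:
        - name: tmp
          mountPath: /tmp                 # the only writable path
  volumes:
    - name: tmp
      emptyDir: {}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This does not stop an attacker from reading mounted volumes, but it takes away the package manager and the writable filesystem they would rely on for the rest.&lt;/p&gt;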
&lt;p&gt;&lt;strong&gt;3. It&amp;rsquo;s not that easy, especially for storage&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Enabling &lt;code&gt;hostUsers: false&lt;/code&gt; breaks existing storage in most cases.&lt;/p&gt;
&lt;p&gt;Container UID 0 is mapped to UID 100000+ on the host (each pod has its own offset). If a persistent volume (NFS, EBS, Ceph RBD) is owned by UID 1000, the root container can&amp;rsquo;t read or write it. The result: counterintuitive &lt;code&gt;Permission Denied&lt;/code&gt; errors that can be hard to diagnose, since the application was likely never designed to be root and yet have no access to its own files.&lt;/p&gt;
&lt;p&gt;The technical solution exists (&lt;em&gt;idmapped mounts&lt;/em&gt;), but it requires a recent kernel and a compatible filesystem. See the &lt;a class="link" href="https://www.kernel.org/doc/html/latest/filesystems/idmappings.html" target="_blank" rel="noopener"
&gt;official idmapped mounts documentation&lt;/a&gt; for details.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;4. Same story, but for networking&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;code&gt;hostUsers: false&lt;/code&gt; is incompatible with &lt;code&gt;hostNetwork: true&lt;/code&gt;. It&amp;rsquo;s a corner case, but it catches networking workloads (monitoring agents, CNI plugins, etc.).&lt;/p&gt;
&lt;p&gt;Note: that said, running containers with &lt;code&gt;hostNetwork&lt;/code&gt; is &lt;strong&gt;its own security problem&lt;/strong&gt;, so&amp;hellip;&lt;/p&gt;
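&lt;p&gt;Concretely, a manifest like the sketch below is expected to be rejected by API validation, because a pod running in its own user namespace may not share host namespaces (the same applies to &lt;code&gt;hostPID&lt;/code&gt; and &lt;code&gt;hostIPC&lt;/code&gt;). Names and image are placeholders.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: v1
kind: Pod
metadata:
  name: invalid-combo-demo                  # hypothetical name
spec:
  hostUsers: false
  hostNetwork: true                         # conflicts with hostUsers: false
  containers:
    - name: agent
      image: registry.example.com/agent:1.0 # placeholder image
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;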
&lt;hr&gt;
&lt;h2 id="honest-comparison-userns-vs-the-real-alternatives"&gt;Honest comparison: UserNS vs the real alternatives
&lt;/h2&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th style="text-align: left"&gt;Attack vector&lt;/th&gt;
&lt;th style="text-align: center"&gt;UserNS (root inside)&lt;/th&gt;
&lt;th style="text-align: center"&gt;Non-root (UID 1000)&lt;/th&gt;
&lt;th style="text-align: center"&gt;Distroless / Scratch&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Post-escape impact after successful container escape&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Nobody on host&lt;/td&gt;
&lt;td style="text-align: center"&gt;⚠️ UID 1000 on host&lt;/td&gt;
&lt;td style="text-align: center"&gt;⚠️ UID 1000 on host&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;UID isolation between pods (multi-tenant)&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Unique offset per pod&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Shared UID on node&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Shared UID on node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Malware installation inside the container&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Trivial&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Possible&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Near impossible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Write scope in ephemeral container FS&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Full filesystem&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ App directory only&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Near impossible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Lateral movement via SA Token&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Possible&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Possible&lt;/td&gt;
&lt;td style="text-align: center"&gt;⚠️ Potentially difficult&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Operational complexity&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Sometimes high&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Often near zero&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Often low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td style="text-align: left"&gt;Compatibility with existing storage&lt;/td&gt;
&lt;td style="text-align: center"&gt;❌ Sometimes problematic&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Standard&lt;/td&gt;
&lt;td style="text-align: center"&gt;✅ Standard&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;Reading the table reveals the true nature of UserNamespaces: they excel on &lt;strong&gt;exactly two rows&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;post-escape impact&lt;/li&gt;
&lt;li&gt;UID isolation in multi-tenant environments&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On everything else, non-root + distroless does better, or just as well, without the operational complexity. And that &amp;ldquo;everything else&amp;rdquo; — write scope in ephemeral FS, malware installation, lateral movement via SA Token — represents the vast majority of real-world attack vectors, far more common than a container escape. We&amp;rsquo;ll come back to this in the &lt;a class="link" href="#where-to-invest-your-security-budget" &gt;Where to invest your security budget&lt;/a&gt; section.&lt;/p&gt;
&lt;h2 id="real-use-cases"&gt;Real use cases
&lt;/h2&gt;&lt;p&gt;It would be dishonest to dismiss the feature entirely. There are three scenarios where UserNamespaces aren&amp;rsquo;t a lazy option but a genuine technical necessity (with caveats).&lt;/p&gt;
&lt;h3 id="1-build-as-a-service-buildah-rootless-podman"&gt;1. Build-as-a-Service (Buildah, rootless Podman)
&lt;/h3&gt;&lt;p&gt;To build a Docker image, the build engine needs to perform &lt;code&gt;chown&lt;/code&gt;, &lt;code&gt;chmod&lt;/code&gt; and &lt;code&gt;mknod&lt;/code&gt;. These operations require capabilities such as &lt;code&gt;CAP_CHOWN&lt;/code&gt;, &lt;code&gt;CAP_FOWNER&lt;/code&gt; and &lt;code&gt;CAP_MKNOD&lt;/code&gt;. Before UserNamespaces, the solution was to run the build pod in privileged mode (&lt;code&gt;privileged: true&lt;/code&gt;), an obvious open door to the host.&lt;/p&gt;
&lt;p&gt;With &lt;code&gt;hostUsers: false&lt;/code&gt;, the build engine believes it&amp;rsquo;s root for its own file manipulation, but it can&amp;rsquo;t touch the host. This is the only case where &amp;ldquo;root inside&amp;rdquo; is a technical constraint rather than technical debt.&lt;/p&gt;
&lt;p&gt;Note: &lt;a class="link" href="https://github.com/GoogleContainerTools/kaniko" target="_blank" rel="noopener"
&gt;Kaniko&lt;/a&gt;, long the go-to for in-cluster builds, has been archived since October 2025 and no longer receives security updates. Buildah or rootless Podman are the active alternatives.&lt;/p&gt;
&lt;p&gt;My take: it can be useful for shared CI/CD platforms (GitLab Runners, Tekton) that refuse privileged pods. But if isolation is critical (public platform, aggressive multi-tenancy), microVMs (Kata Containers, Firecracker) offer far stronger guarantees for an overhead that has become quite manageable.&lt;/p&gt;
&lt;h3 id="2-hostile-multi-tenancy-user-code-platforms"&gt;2. Hostile multi-tenancy (user code platforms)
&lt;/h3&gt;&lt;p&gt;If your business is running code provided by strangers (PaaS, online code editors, public CI/CD), you know upfront that users &lt;em&gt;will&lt;/em&gt; try to escalate their privileges. In this context, UserNS is an extra barrier against kernel 0-days.&lt;/p&gt;
&lt;p&gt;My take: honestly, if the environment is truly &lt;strong&gt;hostile&lt;/strong&gt;, UserNS alone isn&amp;rsquo;t enough. MicroVMs (Kata Containers, Firecracker) provide real hardware isolation and are the right choice here. UserNS can be a complement, not a substitute.&lt;/p&gt;
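&lt;p&gt;In Kubernetes, a microVM runtime is selected per pod through a RuntimeClass. A minimal sketch, assuming Kata Containers is already installed on the nodes and registered with the container runtime under a handler named &lt;code&gt;kata&lt;/code&gt; (the handler name depends on your installation; pod and image names are placeholders):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata                                # assumption: must match the handler configured in containerd or CRI-O
---
apiVersion: v1
kind: Pod
metadata:
  name: untrusted-code-demo                  # hypothetical name
spec:
  runtimeClassName: kata                     # run this pod inside a microVM
  containers:
    - name: sandbox
      image: registry.example.com/runner:1.0 # placeholder image
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Pods that do not set &lt;code&gt;runtimeClassName&lt;/code&gt; keep using the default runtime, so the microVM overhead is only paid by the workloads that need it.&lt;/p&gt;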
&lt;h3 id="3-hard-coded-legacy-postfix-dovecot-bind"&gt;3. Hard-coded legacy (Postfix, Dovecot, BIND)
&lt;/h3&gt;&lt;p&gt;Some old UNIX daemons start as root to open a privileged port (&amp;lt; 1024) or read sensitive config files, then drop privileges via &lt;code&gt;setuid()&lt;/code&gt;. This mechanism fails in a classic non-root container.&lt;/p&gt;
&lt;p&gt;UserNamespaces let these processes make their identity-management syscalls (&lt;code&gt;setuid()&lt;/code&gt;, &lt;code&gt;setgid()&lt;/code&gt;) successfully, because they really are root inside their own namespace.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s a concrete example written by a colleague (thanks Louis 😘):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;Pod&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;postfix&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;hostUsers: false # UID mapping&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;root in container → nobody on host&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;runAsNonRoot: false # allowed under PSS Restricted *only* because of hostUsers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;fsGroup&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;103&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# postfix GID&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;postfix&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;postfix:latest&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;runAsNonRoot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# same — cf. https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-admission-checks&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;allowPrivilegeEscalation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;readOnlyRootFilesystem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;seccompProfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;RuntimeDefault&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;ALL&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# drop everything first&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;add&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;SETUID &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# privilege drop via setuid() done by postfix itself at startup&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;SETGID &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# same for groups&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;CHOWN &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# chown on mail queues at startup&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;FOWNER &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# file operations without being the owner&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;FSETID &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# preserve setuid bit after write&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;DAC_OVERRIDE &lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# MANDATORY: root in UserNS is not &amp;#34;real&amp;#34; root,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# DAC checks are not automatically bypassed&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This manifest illustrates several important things.&lt;/p&gt;
&lt;p&gt;First, making a legacy application actually secure with UserNS is painful and requires compromises, especially around capabilities. This is a far cry from the &amp;ldquo;magic feature that secures root apps&amp;rdquo; that LinkedIn wannabe influencers imply.&lt;/p&gt;
&lt;p&gt;Then there are some interesting surprises. Normally, the &lt;code&gt;Restricted&lt;/code&gt; Pod Security Standard forbids &lt;code&gt;runAsNonRoot: false&lt;/code&gt;. Kubernetes makes an exception when &lt;code&gt;hostUsers: false&lt;/code&gt; is present. This is documented &lt;a class="link" href="https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-admission-checks" target="_blank" rel="noopener"
&gt;here&lt;/a&gt;. Without UserNamespaces, this pod would be rejected by the admission controller.&lt;/p&gt;
&lt;p&gt;There&amp;rsquo;s also the &lt;strong&gt;&lt;code&gt;DAC_OVERRIDE&lt;/code&gt; capability&lt;/strong&gt;, which is counterintuitive. Root in a UserNS is not real root from the kernel&amp;rsquo;s perspective for DAC (Discretionary Access Control) checks. When Postfix runs &lt;code&gt;set-permissions&lt;/code&gt; to &lt;code&gt;chown&lt;/code&gt; its queues, the kernel still verifies permissions, and denies the operation if &lt;code&gt;DAC_OVERRIDE&lt;/code&gt; isn&amp;rsquo;t present. This is exactly the kind of operational surprise that stays invisible until the first production deployment.&lt;/p&gt;
&lt;p&gt;Worth noting: we were still able to keep &lt;code&gt;readOnlyRootFilesystem: true&lt;/code&gt; and &lt;code&gt;allowPrivilegeEscalation: false&lt;/code&gt; — legacy doesn&amp;rsquo;t justify throwing everything overboard.&lt;/p&gt;
&lt;p&gt;My take: this is the only use case where UserNS is genuinely acceptable. No untrusted third-party code, no hostile platform — just well-identified legacy with a &lt;strong&gt;migration plan&lt;/strong&gt;. The other two cases are &amp;ldquo;acceptable under conditions&amp;rdquo;; the legacy case is the cleanest of the three.&lt;/p&gt;
&lt;h2 id="some-counterarguments"&gt;Some counterarguments
&lt;/h2&gt;&lt;p&gt;I see you coming with objections, so let&amp;rsquo;s save everyone some time with a quick Q&amp;amp;A:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;It&amp;rsquo;s defense in depth.&amp;rdquo;&lt;/strong&gt;
True — but defense in depth assumes the foundational layers are already in place. If you haven&amp;rsquo;t migrated your images to non-root yet, investing energy in UserNS is putting the cart before the horse. And once you&amp;rsquo;re non-root, the marginal gain of UserNS is negligible compared to the complexity it introduces.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;We don&amp;rsquo;t control third-party images.&amp;rdquo;&lt;/strong&gt;
Somewhat weak, in my opinion: if a proprietary vendor&amp;rsquo;s black-box image is hardcoded to run as root, there&amp;rsquo;s a good chance it either genuinely needs it (as is the case for some proprietary security tooling) or it will break with UID mapping (see the storage problem above). UserNS is not a magic wand that makes any third-party image compatible and secure.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;It&amp;rsquo;s a centralized safeguard against human error.&amp;rdquo;&lt;/strong&gt;
It&amp;rsquo;s just as easy to forget &lt;code&gt;hostUsers: false&lt;/code&gt; as it is to forget &lt;code&gt;runAsNonRoot: true&lt;/code&gt;. The real centralized solution is &lt;strong&gt;Pod Security Standards&lt;/strong&gt; or an Admission Controller (Kyverno, OPA) that outright rejects root pods. Simpler, more reliable, and it doesn&amp;rsquo;t break storage.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&amp;ldquo;We need it for SOC2/PCI-DSS/&amp;hellip; compliance.&amp;rdquo;&lt;/strong&gt;
If your compliance requires strict tenant isolation, UserNS will likely be deemed insufficient by your auditors. VMs or microVMs remain the gold standard. Using UserNS for compliance means choosing the most complex tool to maintain for a result that remains debatable.&lt;/p&gt;
&lt;h2 id="where-to-invest-your-security-budget"&gt;Where to invest your security budget
&lt;/h2&gt;&lt;p&gt;Setting aside the marketing, here&amp;rsquo;s where effort actually pays off — from highest impact to most niche:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Priority 1 — Non-root images + &lt;code&gt;nobody&lt;/code&gt; (UID 65534)&lt;/strong&gt;
Move images to non-root, ideally using the &lt;code&gt;nobody&lt;/code&gt; user (the least privileged on the system). If an application is compromised under &lt;code&gt;nobody&lt;/code&gt;, the attacker can do almost nothing, even on the container filesystem. Combine with &lt;code&gt;readOnlyRootFilesystem: true&lt;/code&gt; and &lt;code&gt;capabilities: drop: [&amp;quot;ALL&amp;quot;]&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;runAsNonRoot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;runAsUser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="m"&gt;65534&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c"&gt;# nobody&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;seccompProfile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;RuntimeDefault&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;app&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;my-app:distroless&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;securityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;allowPrivilegeEscalation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;ALL&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;readOnlyRootFilesystem&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Priority 2 — Pod Security Standards (PSS) at &lt;code&gt;Baseline&lt;/code&gt; or &lt;code&gt;Restricted&lt;/code&gt;&lt;/strong&gt;
Block root and privileges without breaking anything at the infra level. It requires having done Priority 1 first, but it&amp;rsquo;s free and standard: you enforce it per namespace with a label, or cluster-wide through the defaults of the built-in PodSecurity admission plugin (with per-namespace overrides when needed). No more risk of forgetting. Already enabled by default on several Kubernetes distributions (Talos being one, but not the only one).&lt;/p&gt;
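&lt;p&gt;As a minimal sketch of the namespace-label approach, assuming the built-in Pod Security Admission controller is enabled and using a hypothetical namespace called &lt;code&gt;my-app&lt;/code&gt;, the labels could look like this (pick the level your workloads can actually pass):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: v1
kind: Namespace
metadata:
  name: my-app # hypothetical namespace name
  labels:
    # reject any pod that violates the Restricted profile
    pod-security.kubernetes.io/enforce: restricted
    # follow the profile definition of the current Kubernetes version
    pod-security.kubernetes.io/enforce-version: latest
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;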
&lt;p&gt;&lt;strong&gt;Priority 3 — MicroVMs (Kata Containers, Firecracker)&lt;/strong&gt;
For truly untrusted workloads. Real hardware isolation, overhead now quite reasonable on recent generations.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Priority 4 — UserNamespaces&lt;/strong&gt;
When all else fails. Only for the legitimate cases identified above (builds, legacy, hostile multi-tenancy). This is genuinely the &lt;strong&gt;last&lt;/strong&gt; thing to do.&lt;/p&gt;
&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;Kubernetes 1.36 UserNamespaces are the result of a project that officially took five years (KEP-127 dates back to 2021) and has been discussed since nearly the dawn of Kubernetes. For shared build platforms and multi-tenant SaaS running user-provided code, it&amp;rsquo;s a potentially useful building block, particularly to prevent one customer&amp;rsquo;s app from reading another customer&amp;rsquo;s data in the event of a container escape without privilege escalation.&lt;/p&gt;
&lt;p&gt;For everything else — that is to say, 99% of production clusters — that&amp;rsquo;s not where container security starts. And that&amp;rsquo;s precisely the problem with this kind of infographic.&lt;/p&gt;
&lt;p&gt;LinkedIn infographics selling effortless security are dangerous: &amp;ldquo;keep your 800MB root image full of tools, just add &lt;code&gt;hostUsers: false&lt;/code&gt;, and you&amp;rsquo;re protected.&amp;rdquo; That&amp;rsquo;s exactly &lt;strong&gt;the wrong approach&lt;/strong&gt;. Real container security is built in the Dockerfile, not in the PodSpec.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;If you&amp;rsquo;re enabling UserNamespaces to secure an application whose source code you own, you&amp;rsquo;ve probably missed a step in your secure software development lifecycle.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id="references"&gt;References
&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/127-user-namespaces/README.md" target="_blank" rel="noopener"
&gt;KEP-127: Support for User Namespaces&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kubernetes.io/docs/concepts/workloads/pods/user-namespaces/" target="_blank" rel="noopener"
&gt;Kubernetes docs — User Namespaces&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kubernetes.io/docs/concepts/security/pod-security-standards/" target="_blank" rel="noopener"
&gt;Pod Security Standards&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2019-5736" target="_blank" rel="noopener"
&gt;CVE-2019-5736&lt;/a&gt; — runc: write to host&amp;rsquo;s &lt;code&gt;/proc/self/exe&lt;/code&gt; from inside the container&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2022-0492" target="_blank" rel="noopener"
&gt;CVE-2022-0492&lt;/a&gt; — cgroups v1: escape via &lt;code&gt;unshare&lt;/code&gt;, UserNS helps but &lt;code&gt;runAsNonRoot&lt;/code&gt; does too&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://nvd.nist.gov/vuln/detail/CVE-2024-21626" target="_blank" rel="noopener"
&gt;CVE-2024-21626&lt;/a&gt; — runc &amp;ldquo;Leaky Vessels&amp;rdquo;: file descriptor leak to host working directory&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/GoogleContainerTools/distroless" target="_blank" rel="noopener"
&gt;Distroless containers — GoogleContainerTools&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>GenAI and software development, episode 2: kubectl-debug-pvc, from idea to krew in 2x30 minutes</title><link>https://blog.zwindler.fr/en/2026/03/17/genai-and-software-development-episode-2-kubectl-debug-pvc-from-idea-to-krew-in-2x30-minutes/</link><pubDate>Tue, 17 Mar 2026 18:00:00 +0100</pubDate><guid>https://blog.zwindler.fr/en/2026/03/17/genai-and-software-development-episode-2-kubectl-debug-pvc-from-idea-to-krew-in-2x30-minutes/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/03/kubectl-debug-pvc.webp" alt="Featured image of post GenAI and software development, episode 2: kubectl-debug-pvc, from idea to krew in 2x30 minutes" /&gt;&lt;h2 id="previously-on-genai-and-dev"&gt;Previously, on &amp;ldquo;GenAI and dev&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;In &lt;a class="link" href="https://blog.zwindler.fr/en/2026/03/08/genai-and-software-development-lessons-learned-with-podsweeper/" &gt;my previous article&lt;/a&gt;, I talked about my experience with PodSweeper, a project developed with OpenCode and Claude Opus. The outcome was mixed: impressive raw speed, but race conditions, lax error handling, and a constant need for human supervision (among other disappointments).&lt;/p&gt;
&lt;p&gt;Today, I&amp;rsquo;m talking about a second project, much simpler, born from a real production need. And the takeaway is quite different.&lt;/p&gt;
&lt;h2 id="the-incident-that-started-it-all"&gt;The incident that started it all
&lt;/h2&gt;&lt;p&gt;You may know this situation: a pod in production, running fine, using a PVC in &lt;code&gt;ReadWriteOnce&lt;/code&gt;. You need to look at the contents of that volume. Production best practice: the pod in question has no shell (we reduce the attack surface). No &lt;code&gt;/bin/sh&lt;/code&gt;, no &lt;code&gt;/bin/bash&lt;/code&gt;, nothing.&lt;/p&gt;
&lt;p&gt;No big deal, we have &lt;code&gt;kubectl debug&lt;/code&gt; for that, right?&lt;/p&gt;
&lt;p&gt;OK, let&amp;rsquo;s create an ephemeral container in the pod and&amp;hellip; oh wait. &lt;code&gt;kubectl debug&lt;/code&gt; doesn&amp;rsquo;t allow mounting the pod&amp;rsquo;s volumes in the ephemeral container. It simply doesn&amp;rsquo;t expose the &lt;code&gt;volumeMounts&lt;/code&gt; option for ephemeral containers.&lt;/p&gt;
&lt;p&gt;We could kill the pod, which would allow mounting the RWO PVC in another pod with a shell, but we don&amp;rsquo;t want to — it&amp;rsquo;s production.&lt;/p&gt;
&lt;p&gt;My colleague Maxime (him again!) suggested a workaround. The manual procedure is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Create an ephemeral container on the pod with &lt;code&gt;kubectl debug&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Manually build a JSON patch to add &lt;code&gt;volumeMounts&lt;/code&gt; to the ephemeral container we just created&lt;/li&gt;
&lt;li&gt;In another terminal, connect directly to the kube API with &lt;code&gt;kubectl proxy&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Apply the patch via a curl to the Kubernetes API&lt;/li&gt;
&lt;li&gt;Wait for the container to be ready, attach to the container&lt;/li&gt;
&lt;li&gt;Wonder if there&amp;rsquo;s an easier career than patching containers with curl&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;For those who think this is doable, &amp;ldquo;check out this patch&amp;rdquo; as the kids say:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;curl&lt;/span&gt; &lt;span class="err"&gt;http:&lt;/span&gt;&lt;span class="c1"&gt;//localhost:8001/api/v1/namespaces/&amp;lt;namespace&amp;gt;/pods/&amp;lt;pod&amp;gt;/ephemeralcontainers \
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;-X&lt;/span&gt; &lt;span class="err"&gt;PATCH&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;-H&lt;/span&gt; &lt;span class="err"&gt;&amp;#39;Content-Type:&lt;/span&gt; &lt;span class="err"&gt;application/strategic-merge-patch+json&amp;#39;&lt;/span&gt; &lt;span class="err"&gt;\&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="err"&gt;-d&lt;/span&gt; &lt;span class="err"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;spec&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;ephemeralContainers&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;name&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;debugger&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;image&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;ubuntu&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;command&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;/bin/sh&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;targetContainerName&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;&amp;lt;target-container&amp;gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;stdin&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;tty&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;volumeMounts&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;name&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;&amp;lt;volume-name&amp;gt;&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;mountPath&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;/debug/&amp;lt;volume-name&amp;gt;&amp;#34;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="err"&gt;&amp;#39;&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you think this is error-prone, you&amp;rsquo;re right. And if you think it&amp;rsquo;s painful to do under pressure during a production incident, you&amp;rsquo;re even more right.&lt;/p&gt;
&lt;h2 id="side-note"&gt;Side note
&lt;/h2&gt;&lt;p&gt;The more seasoned among you with recent Kubernetes versions may have heard about &lt;a class="link" href="https://kep.k8s.io/2590" target="_blank" rel="noopener"
&gt;KEP-2590&lt;/a&gt;, which introduces a (relatively) new &lt;code&gt;--subresource&lt;/code&gt; flag for &lt;code&gt;kubectl&lt;/code&gt;. Indeed, ephemeral containers created by &lt;code&gt;kubectl debug&lt;/code&gt; are not &lt;em&gt;resources&lt;/em&gt; (a pod, a deployment, &amp;hellip;) but &lt;em&gt;subresources&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Until version 1.33, this flag was not generally available (it first shipped as alpha in 1.24), so in practice &lt;code&gt;kubectl&lt;/code&gt; operated on resources only, not subresources.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Kubernetes v1.33: Octarine&lt;/strong&gt; brings stable support for the &lt;code&gt;--subresource&lt;/code&gt; flag, &lt;strong&gt;but only&lt;/strong&gt; for 3 subresources for now: &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;scale&lt;/code&gt; and &lt;code&gt;resize&lt;/code&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://kubernetes.io/blog/2025/04/23/kubernetes-v1-33-release/#subresource-support-in-kubectl" target="_blank" rel="noopener"
&gt;https://kubernetes.io/blog/2025/04/23/kubernetes-v1-33-release/#subresource-support-in-kubectl&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://kubernetes.io/docs/reference/kubectl/conventions/#subresources" target="_blank" rel="noopener"
&gt;https://kubernetes.io/docs/reference/kubectl/conventions/#subresources&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;No solution on that front either&amp;hellip;&lt;/p&gt;
&lt;h2 id="the-meeting-prompt"&gt;The meeting prompt
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;d had this idea in mind since the incident. Monday morning, at the start of a meeting (while people were still making jokes), I launched OpenCode with Claude Opus and typed a prompt describing:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The problem: &lt;code&gt;kubectl debug&lt;/code&gt; doesn&amp;rsquo;t mount PVC volumes if they&amp;rsquo;re RWO&lt;/li&gt;
&lt;li&gt;The manual procedure I was following (the painful steps described above)&lt;/li&gt;
&lt;li&gt;What I wanted: a tool that automates all of that&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;OpenCode + Opus asked me 2 questions (and I added one instruction):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;Which language?&amp;rdquo;&lt;/em&gt; → &lt;strong&gt;Go&lt;/strong&gt; (I insist)&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&amp;ldquo;CLI only or TUI?&amp;rdquo;&lt;/em&gt; → &lt;strong&gt;Both&lt;/strong&gt; (non-interactive mode for scripting, TUI for daily use)&lt;/li&gt;
&lt;li&gt;And I asked it to run the &lt;code&gt;golangci-lint&lt;/code&gt; linter every time it considered itself done (trauma from my previous test with OpenCode)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;On its own initiative, it went with a TUI built on &lt;a class="link" href="https://github.com/charmbracelet/bubbletea" target="_blank" rel="noopener"
&gt;Bubble Tea&lt;/a&gt;. Then I stopped watching and followed my meeting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;~30 minutes later, end of the meeting&lt;/strong&gt;, I looked at the result. The POC was functional.&lt;/p&gt;
&lt;h2 id="what-opus-produced-in-30-minutes"&gt;What Opus produced in 30 minutes
&lt;/h2&gt;&lt;p&gt;Without any intervention on my part, the LLM scaffolded a complete Go project:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Clean structure&lt;/strong&gt;: &lt;code&gt;cmd/&lt;/code&gt; for the Cobra CLI, &lt;code&gt;pkg/k8s/&lt;/code&gt; for all Kubernetes interaction, &lt;code&gt;pkg/tui/&lt;/code&gt; for the Bubble Tea TUI&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;The core of the matter&lt;/strong&gt;: a direct call to the Kubernetes API to patch the &lt;code&gt;ephemeralcontainers&lt;/code&gt; subresource of the pod with a strategic merge patch including the &lt;code&gt;volumeMounts&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-interactive mode&lt;/strong&gt;: &lt;code&gt;-n namespace -p pod -v volume:/mount/path&lt;/code&gt;, ready for scripting&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Complete TUI&lt;/strong&gt;: vim-style navigation (&lt;code&gt;j&lt;/code&gt;/&lt;code&gt;k&lt;/code&gt;), fuzzy filtering (&lt;code&gt;/&lt;/code&gt;), multi-selection of volumes, loading spinner&amp;hellip;&lt;/li&gt;
&lt;li&gt;Makefile with a whole bunch of targets (vet, lint, test, build, install)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What I asked it to add:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Smart discovery&lt;/strong&gt;: instead of scanning all pods in all namespaces (slow on one of my large clusters), the tool first lists PVCs cluster-wide (a single API call) to identify relevant namespaces, THEN the pods in the selected namespace. This drastically reduces the number of calls to the kube API server and the TUI only shows namespaces and pods that have PVC volumes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;SecurityContext inheritance&lt;/strong&gt;: the ephemeral container copies the &lt;code&gt;securityContext&lt;/code&gt; from the target container, allowing it to pass PodSecurity policies (&lt;code&gt;restricted&lt;/code&gt;, &lt;code&gt;baseline&lt;/code&gt;&amp;hellip;). This was a case I hadn&amp;rsquo;t anticipated in my initial prompt and it immediately failed on a properly configured cluster.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Even without these small improvements, the tool was already usable (except on clusters that enforce Pod Security policies), but with them it&amp;rsquo;s much more comfortable in practice (because yes, I do use it).&lt;/p&gt;
&lt;p&gt;The code for the core mechanism (the API patch) is a few hundred lines of clean Go. It retrieves the pod, builds the JSON patch with the ephemeral container and its &lt;code&gt;volumeMounts&lt;/code&gt;, applies it via &lt;code&gt;Patch()&lt;/code&gt; on the subresource, waits for the container to be Running, then launches &lt;code&gt;kubectl attach&lt;/code&gt;. Nothing fancy, it&amp;rsquo;s just what&amp;rsquo;s needed, done right the first time.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-golang" data-lang="golang"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ephemeralContainer&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;:=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;corev1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;EphemeralContainer&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;EphemeralContainerCommon&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;corev1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;EphemeralContainerCommon&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;containerName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Command&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;#34;/bin/sh&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;Stdin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;TTY&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;VolumeMounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;mounts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;SecurityContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;targetSecurityContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;TargetContainerName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;targetContainer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="o"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Clientset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CoreV1&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;Pods&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Namespace&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;Patch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;opts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PodName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;types&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;StrategicMergePatchType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;patchBytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;metav1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PatchOptions&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;ephemeralcontainers&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="the-following-iterations-30-minutes"&gt;The following iterations (~30 minutes)
&lt;/h2&gt;&lt;p&gt;Once the POC was validated, I chained a few targeted prompts. I asked it what could be improved. It suggested (and implemented):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A few fixes for minor, non-blocking mistakes it had made in the code&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Warnings&lt;/strong&gt;: if the inherited SecurityContext has &lt;code&gt;readOnlyRootFilesystem&lt;/code&gt;, &lt;code&gt;runAsNonRoot&lt;/code&gt;, or other security measures, the tool warns the user before attaching&lt;/li&gt;
&lt;li&gt;Added a complete CI based on &lt;code&gt;goreleaser&lt;/code&gt; and GitHub Actions&lt;/li&gt;
&lt;li&gt;Added documentation&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="the-cherry-on-top-publishing-on-krew"&gt;The cherry on top: publishing on Krew
&lt;/h2&gt;&lt;p&gt;Once I had a clean version, I asked Opus how to make the plugin easier to install. It suggested &lt;code&gt;brew&lt;/code&gt;, or &lt;a class="link" href="https://krew.sigs.k8s.io/" target="_blank" rel="noopener"
&gt;Krew&lt;/a&gt;, the kubectl plugin manager.&lt;/p&gt;
&lt;p&gt;I asked if the plugin acceptance process was complex. It said no, so I said &amp;ldquo;I dare you!&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;It &lt;strong&gt;completed the entire process&lt;/strong&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Created a fork of &lt;a class="link" href="https://github.com/kubernetes-sigs/krew-index" target="_blank" rel="noopener"
&gt;krew-index&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Wrote the Krew manifest (the &lt;code&gt;.yaml&lt;/code&gt; file describing the plugin)&lt;/li&gt;
&lt;li&gt;Took into account all the best practices requested by krew-index maintainers (download URL format, short descriptions, SHA256 checksums, etc.)&lt;/li&gt;
&lt;li&gt;Prepared the PR, directly on krew-index&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;I just watched. The PR was merged 12 hours later by the maintainers.&lt;/p&gt;
&lt;p&gt;Result: &lt;code&gt;kubectl krew install debug-pvc&lt;/code&gt; works.&lt;/p&gt;
&lt;h2 id="minor-failure--delegating-the-demo-too"&gt;Minor failure — delegating the demo too
&lt;/h2&gt;&lt;p&gt;Emboldened by this success with the krew-index PR, I wondered if we could make a visual demo (a GIF to add to the project&amp;rsquo;s README.md showing how it works) with the LLM.&lt;/p&gt;
&lt;p&gt;I asked if it could do it with &lt;code&gt;asciinema&lt;/code&gt; and the LLM answered &amp;ldquo;yes&amp;rdquo; (yes, I chat with the LLM like I do with my colleagues). Deal, we tried.&lt;/p&gt;
&lt;p&gt;The result was &lt;em&gt;almost&lt;/em&gt; good, and I could have settled for it if I weren&amp;rsquo;t such a nitpicker, but it was a bit slow. The LLM&amp;rsquo;s pacing between actions was uneven, which made the recording a bit unpleasant to watch. I could have iterated until I got something decent, but I ultimately recorded a video myself and converted it to a GIF with &lt;code&gt;ffmpeg&lt;/code&gt;. The result was smoother, and it was faster than iterating.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/03/kubectl-debug-pvc-demo.gif"
loading="lazy"
&gt;&lt;/p&gt;
&lt;h2 id="the-difference-with-podsweeper"&gt;The difference with PodSweeper
&lt;/h2&gt;&lt;p&gt;The difference in experience between this project and PodSweeper is striking. Where PodSweeper was a constant fight (race conditions, LLM amnesia, out-of-spec features), &lt;code&gt;kubectl-debug-pvc&lt;/code&gt; went smoothly without any notable hiccups.&lt;/p&gt;
&lt;p&gt;Why? I think it comes down to the nature of the project:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;One single thing to do&lt;/strong&gt;, clearly defined: patch a Kubernetes subresource with volumeMounts&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No concurrent logic&lt;/strong&gt;: we make an API request, we wait, we attach&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;A well-documented domain&lt;/strong&gt;: the Kubernetes API, Cobra, Bubble Tea. The LLM knows all of this by heart&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;No complex shared state&lt;/strong&gt;: no goroutines stepping on each other, no graceful shutdown to manage, no multiple microservices&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited scope&lt;/strong&gt;: the entire project fits in about fifteen files, CI and documentation included&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is probably the ideal playground for an LLM. A well-defined problem, a well-charted technical domain, a linear solution.&lt;/p&gt;
&lt;h2 id="what-do-i-think-about-it"&gt;What do I think about it?
&lt;/h2&gt;&lt;p&gt;Objectively, this kind of &amp;ldquo;simple&amp;rdquo; project (one thing to do, easy to understand) works &lt;strong&gt;really well&lt;/strong&gt; with OpenCode + Opus. I think it&amp;rsquo;s because of this kind of small project that the hype is so strong around agentic development. You have an idea you were too lazy to dev, you test it, it works. &amp;ldquo;WOW&amp;rdquo;.&lt;/p&gt;
&lt;p&gt;But I don&amp;rsquo;t want to downplay the work done either. The fact that a functional, clean tool, published on Krew and usable by anyone could emerge from a prompt launched at the beginning of a meeting is still pretty wild.&lt;/p&gt;
&lt;p&gt;A bit scary too, thinking that an insane number of micro tools are going to appear in the coming months, with probably uneven quality.&lt;/p&gt;
&lt;h2 id="the-project"&gt;The project
&lt;/h2&gt;&lt;p&gt;The code is available here:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/zwindler/kubectl-debug-pvc" target="_blank" rel="noopener"
&gt;https://github.com/zwindler/kubectl-debug-pvc&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Installation:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Via Krew (recommended)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl krew install debug-pvc
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Interactive usage (TUI)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl debug-pvc
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Non-interactive usage&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl debug-pvc -n my-namespace -p my-pod-0 -v data:/debug/data
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;If you&amp;rsquo;ve ever struggled to inspect a PVC on a pod without a shell, give it a try. And if you find bugs, you can blame the LLM 😄.&lt;/p&gt;</description></item><item><title>Flannel and NetworkPolicies: how to add support with Cilium in CNI chaining mode</title><link>https://blog.zwindler.fr/en/2026/03/15/flannel-and-networkpolicies-how-to-add-support-with-cilium-in-cni-chaining-mode/</link><pubDate>Sun, 15 Mar 2026 18:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/03/15/flannel-and-networkpolicies-how-to-add-support-with-cilium-in-cni-chaining-mode/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/03/flannel-cilium-chaining.webp" alt="Featured image of post Flannel and NetworkPolicies: how to add support with Cilium in CNI chaining mode" /&gt;&lt;h2 id="flannel-flannel-flannel"&gt;Flannel, flannel, flannel&amp;hellip;
&lt;/h2&gt;&lt;p&gt;Flannel is a simple and popular CNI 😢.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s the default CNI in k3s, the one half the kubeadm tutorials use, and you&amp;rsquo;ll find it in quite a few managed offerings too.&lt;/p&gt;
&lt;p&gt;OK, it&amp;rsquo;s simple, it routes packets between pods, it supports VXLAN and WireGuard, it takes 2 minutes to set up. What more could you ask for?&lt;/p&gt;
&lt;p&gt;Well, exactly. There&amp;rsquo;s one thing flannel does &lt;strong&gt;not&lt;/strong&gt; do: NetworkPolicies. (And that&amp;rsquo;s a big deal).&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Flannel is focused on networking. For network policy, other projects such as Calico can be used.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;strong&gt;[Edit]&lt;/strong&gt; Contrary to what I wrote here, flannel actually does support NetworkPolicies and has for at least 2 years, via the reference implementation &lt;a class="link" href="https://github.com/kubernetes-sigs/kube-network-policies" target="_blank" rel="noopener"
&gt;kube-network-policies&lt;/a&gt; from the Kubernetes project. The option isn&amp;rsquo;t prominently featured in the README and is probably not recommended for production either, but it does exist. See the &lt;a class="link" href="https://github.com/flannel-io/flannel/blob/master/Documentation/netpol.md" target="_blank" rel="noopener"
&gt;official documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;And the trap is that if you haven&amp;rsquo;t read that one little line in the &lt;strong&gt;README.md&lt;/strong&gt;, nothing tells you explicitly.&lt;/p&gt;
&lt;p&gt;You can perfectly well create &lt;code&gt;NetworkPolicy&lt;/code&gt; objects in your cluster, &lt;code&gt;kubectl apply&lt;/code&gt; won&amp;rsquo;t complain, &lt;code&gt;kubectl get netpol&lt;/code&gt; will happily list them. Except&amp;hellip; they&amp;rsquo;re not enforced. Traffic still flows. Your deny-all doesn&amp;rsquo;t deny anything at all.&lt;/p&gt;
&lt;h2 id="lets-verify-to-be-sure"&gt;Let&amp;rsquo;s verify to be sure
&lt;/h2&gt;&lt;p&gt;Before solving anything, let&amp;rsquo;s verify the problem exists. Assuming a working Kubernetes cluster with flannel as the CNI, we&amp;rsquo;ll deploy two pods in two different namespaces: a client (curl) and a server (nginx).&lt;/p&gt;
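&lt;p&gt;The manifests for this test setup aren&amp;rsquo;t shown here. As a minimal sketch, something like the following should do. The namespace, pod and Service names are the ones used by the commands in this section, while the images (&lt;code&gt;curlimages/curl&lt;/code&gt;, &lt;code&gt;nginx&lt;/code&gt;), the &lt;code&gt;app: server&lt;/code&gt; label and the file name are assumptions:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# 00-test-setup.yaml (hypothetical file name)
apiVersion: v1
kind: Namespace
metadata:
  name: netpol-test-a
---
apiVersion: v1
kind: Namespace
metadata:
  name: netpol-test-b
---
# Client pod: it only sleeps, so that we can kubectl exec curl from it
apiVersion: v1
kind: Pod
metadata:
  name: client
  namespace: netpol-test-a
spec:
  containers:
    - name: client
      image: curlimages/curl  # assumed image, any image shipping curl works
      command:
        - sleep
        - &amp;#34;86400&amp;#34;
---
# Server pod: plain nginx listening on port 80
apiVersion: v1
kind: Pod
metadata:
  name: server
  namespace: netpol-test-b
  labels:
    app: server  # assumed label, matched by the Service selector below
spec:
  containers:
    - name: server
      image: nginx  # assumed image
      ports:
        - containerPort: 80
---
# Service providing the server.netpol-test-b.svc.cluster.local name
apiVersion: v1
kind: Service
metadata:
  name: server
  namespace: netpol-test-b
spec:
  selector:
    app: server
  ports:
    - port: 80
      targetPort: 80
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Apply it with &lt;code&gt;kubectl apply -f 00-test-setup.yaml&lt;/code&gt; and wait for both pods to be &lt;code&gt;Running&lt;/code&gt; before testing.&lt;/p&gt;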
&lt;p&gt;Let&amp;rsquo;s check that the client can reach the server:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; -n netpol-test-a client -- &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; curl -s --max-time &lt;span class="m"&gt;5&lt;/span&gt; http://server.netpol-test-b.svc.cluster.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;We get the nginx welcome page. So far, so normal. Now, let&amp;rsquo;s apply a &lt;em&gt;deny-all&lt;/em&gt; NetworkPolicy on the server&amp;rsquo;s namespace:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# 01-deny-all-ingress.yaml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;networking.k8s.io/v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;NetworkPolicy&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;deny-all-ingress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;netpol-test-b&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;podSelector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;{}&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;policyTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="l"&gt;Ingress&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl apply -f 01-deny-all-ingress.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;And let&amp;rsquo;s test again:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; -n netpol-test-a client -- &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; curl -s --max-time &lt;span class="m"&gt;5&lt;/span&gt; http://server.netpol-test-b.svc.cluster.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Result: the nginx page still shows up.&lt;/strong&gt; The NetworkPolicy was created (&lt;code&gt;kubectl get netpol -n netpol-test-b&lt;/code&gt; shows it), but it&amp;rsquo;s not enforced. Traffic flows as if nothing happened.&lt;/p&gt;
&lt;p&gt;This is the expected behavior with flannel (by default). Flannel only does L3 routing (VXLAN or WireGuard overlay). It doesn&amp;rsquo;t implement a NetworkPolicy controller. The objects exist in etcd, but nobody translates them into filtering rules.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s clean up the policy before moving on:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl delete -f 01-deny-all-ingress.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h2 id="alternatives-for-adding-support"&gt;Alternatives for adding support
&lt;/h2&gt;&lt;p&gt;For a long time I thought that was simply how things were with flannel. I still think flannel is not a good choice for production.&lt;/p&gt;
&lt;p&gt;BUT recently, I discovered that it was possible to chain CNIs within the same cluster, and thus have one CNI handling most tasks (here Flannel) and another one taking care of other tasks, such as enforcing NetworkPolicies.&lt;/p&gt;
&lt;p&gt;This is actually the principle behind Canal, which I knew by name but had never explored. In fact, it&amp;rsquo;s nothing more than a manifest that deploys Flannel as the main CNI with Calico for NetworkPolicy enforcement!&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Canal was the name of Tigera and CoreOS&amp;rsquo;s project to integrate Calico and flannel.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Note: &lt;a class="link" href="https://github.com/projectcalico/canal?tab=readme-ov-file" target="_blank" rel="noopener"
&gt;the GitHub project was archived in October 2025&lt;/a&gt; but in theory it should still work, if you can find the correct documentation (I couldn&amp;rsquo;t, the links are broken, but I didn&amp;rsquo;t really look that hard either).&lt;/p&gt;
&lt;p&gt;So you get the idea: we&amp;rsquo;re going to add a component that will &lt;strong&gt;watch&lt;/strong&gt; NetworkPolicy objects and translate them into effective filtering rules (iptables, eBPF, nftables&amp;hellip;), &lt;strong&gt;without touching the existing flannel setup&lt;/strong&gt;. This is called &amp;ldquo;CNI chaining&amp;rdquo; or &amp;ldquo;policy-only&amp;rdquo; mode.&lt;/p&gt;
&lt;p&gt;There are several solutions:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Calico (Canal)&lt;/strong&gt;: Historically, the flannel + Calico combination is called &amp;ldquo;Canal&amp;rdquo;, which I just mentioned. &lt;strong&gt;But&lt;/strong&gt; the official Canal manifest bundles its own flannel in the same DaemonSet as calico-node. If your flannel is already installed and managed (by you, by an operator, by a provider&amp;hellip;), you probably don&amp;rsquo;t want to replace it. And the Tigera operator (the &amp;ldquo;official&amp;rdquo; Helm method) doesn&amp;rsquo;t support policy-only deployment on an existing flannel either. In short, it&amp;rsquo;s doable but requires some effort. Can&amp;rsquo;t be bothered.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;kube-router&lt;/strong&gt;: kube-router can run in firewall-only mode (&lt;code&gt;--run-firewall=true&lt;/code&gt;) and only needs iptables/ipset. This is actually what k3s uses by default for NetworkPolicies. It&amp;rsquo;s the lightest solution (supposedly ~50 MB of RAM per node). Make sure your kernel has the &lt;code&gt;ip_set&lt;/code&gt; module, otherwise it won&amp;rsquo;t work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cilium in CNI chaining mode&lt;/strong&gt;: This is the solution I chose and that we&amp;rsquo;ll detail here. Cilium attaches to the veth interfaces created by flannel and adds its eBPF programs for policy enforcement. No dependency on iptables or ipset, and as a bonus we get Hubble for network observability.&lt;/p&gt;
&lt;h2 id="installing-cilium-in-cni-chaining-mode"&gt;Installing Cilium in CNI chaining mode
&lt;/h2&gt;&lt;h3 id="prerequisites"&gt;Prerequisites
&lt;/h3&gt;&lt;ul&gt;
&lt;li&gt;A working Kubernetes cluster with flannel&lt;/li&gt;
&lt;li&gt;&lt;code&gt;helm&lt;/code&gt; v3+&lt;/li&gt;
&lt;li&gt;A kernel &amp;gt;= 4.19 (ideally &amp;gt;= 5.10 for all eBPF features)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id="retrieve-flannels-cni-configuration"&gt;Retrieve flannel&amp;rsquo;s CNI configuration
&lt;/h3&gt;&lt;p&gt;Cilium in chaining mode needs to know the existing CNI configuration. Let&amp;rsquo;s retrieve it from a node:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl debug node/&lt;span class="k"&gt;$(&lt;/span&gt;kubectl get nodes -o &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;{.items[0].metadata.name}&amp;#39;&lt;/span&gt;&lt;span class="k"&gt;)&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; -it --image&lt;span class="o"&gt;=&lt;/span&gt;busybox -- cat /host/etc/cni/net.d/10-flannel.conflist
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;On my cluster, this gives:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-json" data-lang="json"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;name&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;cbr0&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;cniVersion&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;0.3.1&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;plugins&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;type&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;flannel&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;delegate&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;hairpinMode&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;isDefaultGateway&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;type&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;#34;portmap&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;capabilities&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="nt"&gt;&amp;#34;portMappings&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Note the &lt;code&gt;name&lt;/code&gt; field&lt;/strong&gt; (here &lt;code&gt;cbr0&lt;/code&gt;). We&amp;rsquo;ll need it.&lt;/p&gt;
&lt;h3 id="create-the-chaining-configmap"&gt;Create the chaining ConfigMap
&lt;/h3&gt;&lt;p&gt;We&amp;rsquo;ll create a ConfigMap that takes the flannel config and adds the &lt;code&gt;cilium-cni&lt;/code&gt; plugin in chaining mode:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;v1&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;ConfigMap&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;cni-configuration&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;kube-system&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;cni-config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;|-&lt;/span&gt;&lt;span class="sd"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;name&amp;#34;: &amp;#34;cbr0&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;cniVersion&amp;#34;: &amp;#34;0.3.1&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;plugins&amp;#34;: [
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;type&amp;#34;: &amp;#34;flannel&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;delegate&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;hairpinMode&amp;#34;: true,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;isDefaultGateway&amp;#34;: true
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;type&amp;#34;: &amp;#34;portmap&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;capabilities&amp;#34;: {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;portMappings&amp;#34;: true
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; },
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; {
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;type&amp;#34;: &amp;#34;cilium-cni&amp;#34;,
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; &amp;#34;chaining-mode&amp;#34;: &amp;#34;generic-veth&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; }
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; ]
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="sd"&gt; }&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Warning&lt;/strong&gt;: the &lt;code&gt;name&lt;/code&gt; field must match your flannel &lt;em&gt;conflist&lt;/em&gt;. If yours is called &lt;code&gt;cni0&lt;/code&gt; or something else, adjust accordingly.&lt;/p&gt;
&lt;p&gt;Before applying, also verify that your CNI uses &lt;strong&gt;veth&lt;/strong&gt; interfaces (this is the default with flannel, but better safe than sorry). From a node:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;ip -d link &lt;span class="p"&gt;|&lt;/span&gt; grep veth
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;You should see veth-type interfaces corresponding to your pods, for example:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;103: lxcb3901b7f9c02@if102: &amp;lt;BROADCAST,MULTICAST,UP,LOWER_UP&amp;gt; ...
veth addrgenmode eui64 numtxqueues 1 numrxqueues 1
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If that&amp;rsquo;s the case, Cilium&amp;rsquo;s &lt;code&gt;generic-veth&lt;/code&gt; mode will work. Let&amp;rsquo;s apply:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl apply -f cilium-cni-configmap.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="install-cilium-via-helm"&gt;Install Cilium via Helm
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;helm repo add cilium https://helm.cilium.io/
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;helm repo update
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Here are the values for chaining mode:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# cilium-values.yaml&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;cni&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;chainingMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;generic-veth&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;customConf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;configMap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;cni-configuration&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;install&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;routingMode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l"&gt;native&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;enableIPv4Masquerade&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;enableIPv6Masquerade&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;hubble&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;relay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;ui&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The important points:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;cni.chainingMode: generic-veth&lt;/code&gt;: enables chaining mode, where Cilium attaches to the existing veth interfaces&lt;/li&gt;
&lt;li&gt;&lt;code&gt;cni.customConf: true&lt;/code&gt; + &lt;code&gt;cni.configMap&lt;/code&gt;: we provide our own CNI config (the ConfigMap applied above)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;routingMode: native&lt;/code&gt;: flannel handles routing, not Cilium&lt;/li&gt;
&lt;li&gt;&lt;code&gt;enableIPv4Masquerade: false&lt;/code&gt; (and its IPv6 counterpart): flannel handles masquerading&lt;/li&gt;
&lt;li&gt;&lt;code&gt;hubble.enabled: true&lt;/code&gt;: network observability, the big bonus of Cilium&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;helm install cilium cilium/cilium --version 1.19.1 &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; --namespace kube-system &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; -f cilium-values.yaml
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Wait for everything to be ready:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl rollout status daemonset/cilium -n kube-system --timeout&lt;span class="o"&gt;=&lt;/span&gt;120s
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;h3 id="verification"&gt;Verification
&lt;/h3&gt;&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; -n kube-system ds/cilium -- cilium status
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;What we&amp;rsquo;re looking for:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;Kubernetes: Ok 1.35 (v1.35.0) [linux/amd64]
CNI Chaining: generic-veth
Cilium: Ok 1.19.1
Hubble: Ok
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The &lt;code&gt;CNI Chaining: generic-veth&lt;/code&gt; line confirms that Cilium is running in &lt;em&gt;chaining&lt;/em&gt; mode and isn&amp;rsquo;t replacing flannel.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: pods that existed before Cilium was installed are not automatically managed by Cilium. You need to restart them so that Cilium can attach its eBPF programs. Remember to &lt;code&gt;kubectl rollout restart&lt;/code&gt; your test workloads (or recreate them).&lt;/p&gt;
&lt;h2 id="testing-networkpolicies"&gt;Testing NetworkPolicies
&lt;/h2&gt;&lt;p&gt;This is the moment of truth. Let&amp;rsquo;s re-apply our deny-all:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl apply -f 01-deny-all-ingress.yaml
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; -n netpol-test-a client -- &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; curl -s --max-time &lt;span class="m"&gt;5&lt;/span&gt; http://server.netpol-test-b.svc.cluster.local
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Result: timeout&lt;/strong&gt;! This time, the NetworkPolicy is properly enforced. Traffic is blocked.&lt;/p&gt;
&lt;p&gt;I followed up with the other classic NetworkPolicy scenarios, and everything works:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Selective ingress allow by namespace&lt;/strong&gt;: with a policy that allows traffic from &lt;code&gt;netpol-test-a&lt;/code&gt; only, curl works from that namespace but remains blocked from &lt;code&gt;default&lt;/code&gt;. Namespace isolation works.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deny-all egress&lt;/strong&gt;: with all outgoing traffic from the client blocked, even DNS resolution fails (immediate timeout).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Selective egress allow&lt;/strong&gt;: with only DNS (port 53) and the server (port 80 in the &lt;code&gt;netpol-test-b&lt;/code&gt; namespace) allowed, curl to the server works but &lt;code&gt;curl http://example.com&lt;/code&gt; remains blocked.&lt;/li&gt;
&lt;/ul&gt;
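&lt;p&gt;For reference, the two selective scenarios can be sketched with standard &lt;code&gt;networking.k8s.io/v1&lt;/code&gt; NetworkPolicies along these lines. This is an illustrative sketch, not the exact manifests used for the tests: the policy names and the empty &lt;code&gt;podSelector&lt;/code&gt; (which targets every pod in the namespace) are assumptions, and the namespace selectors rely on the &lt;code&gt;kubernetes.io/metadata.name&lt;/code&gt; label that recent Kubernetes versions set automatically on each namespace.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# Hypothetical example: allow ingress to netpol-test-b from netpol-test-a only
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-from-test-a   # assumed name
  namespace: netpol-test-b
spec:
  podSelector: {}                   # assumption: applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: netpol-test-a
---
# Hypothetical example: the client may only reach DNS and the server on port 80
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-dns-and-server # assumed name
  namespace: netpol-test-a
spec:
  podSelector: {}                   # assumption: applies to every pod in the namespace
  policyTypes:
    - Egress
  egress:
    # DNS, to any destination
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
    # HTTP, only towards pods in netpol-test-b
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: netpol-test-b
      ports:
        - protocol: TCP
          port: 80
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;NetworkPolicies are additive: once a pod is selected by any policy for a given direction, only traffic matched by at least one rule is allowed, which is why a deny-all can stay in place alongside allow rules like these.&lt;/p&gt;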
&lt;p&gt;In short: ingress, egress, per-namespace selection, everything works as expected.&lt;/p&gt;
&lt;h2 id="bonus-hubble-network-observability"&gt;Bonus: Hubble, network observability
&lt;/h2&gt;&lt;p&gt;For me, this is the real advantage of Cilium over the alternatives. Hubble lets you see network flows and policy verdicts in real time:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; -n kube-system ds/cilium -- &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; hubble observe --namespace netpol-test-b --last &lt;span class="m"&gt;5&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;pre tabindex="0"&gt;&lt;code&gt;Mar 15 13:20:42.287: netpol-test-a/client:40066 (ID:9745) -&amp;gt;
netpol-test-b/server:80 (ID:22271)
policy-verdict:none ALLOWED (TCP Flags: SYN)
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can see the source pod, destination pod, port, Cilium identity, and policy verdict. When you&amp;rsquo;re debugging a NetworkPolicy that isn&amp;rsquo;t behaving as expected, this is incredibly useful.&lt;/p&gt;
&lt;h2 id="how-much-does-it-cost-in-resources"&gt;How much does it cost in resources?
&lt;/h2&gt;&lt;p&gt;On my cluster (3 nodes), here&amp;rsquo;s what Cilium consumes right after installation:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Per node&lt;/th&gt;
&lt;th&gt;RAM&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;cilium agent&lt;/td&gt;
&lt;td&gt;yes (DaemonSet)&lt;/td&gt;
&lt;td&gt;~160 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cilium-envoy&lt;/td&gt;
&lt;td&gt;yes (DaemonSet)&lt;/td&gt;
&lt;td&gt;~22 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cilium-operator&lt;/td&gt;
&lt;td&gt;no (2 replicas)&lt;/td&gt;
&lt;td&gt;~42 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hubble-relay&lt;/td&gt;
&lt;td&gt;no (1 replica)&lt;/td&gt;
&lt;td&gt;~16 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hubble-ui&lt;/td&gt;
&lt;td&gt;no (1 replica)&lt;/td&gt;
&lt;td&gt;~21 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;That&amp;rsquo;s roughly &lt;strong&gt;180 MB per node&lt;/strong&gt; for the agent + envoy. I don&amp;rsquo;t have a point of comparison with Calico or kube-router, but it seems acceptable to me, and having full visibility into all network flows with Hubble more than justifies the overhead (in my opinion).&lt;/p&gt;
&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;If you have no choice and have to work with flannel, and you want to plug the gaping security hole that is the lack of NetworkPolicy enforcement, know that it is possible to chain another CNI to handle it.&lt;/p&gt;
&lt;p&gt;Short of having it as CNI for everything, Cilium in CNI chaining mode (generic-veth) is a pretty nice solution to fill this gap. It doesn&amp;rsquo;t touch the existing flannel setup, it grafts onto it. And as a bonus, you get Hubble for network observability, which is genuinely valuable.&lt;/p&gt;</description></item><item><title>GenAI and software development: lessons learned with PodSweeper</title><link>https://blog.zwindler.fr/en/2026/03/08/genai-and-software-development-lessons-learned-with-podsweeper/</link><pubDate>Sun, 08 Mar 2026 18:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/03/08/genai-and-software-development-lessons-learned-with-podsweeper/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/02/opencode.webp" alt="Featured image of post GenAI and software development: lessons learned with PodSweeper" /&gt;&lt;h2 id="genai-is-it-fantastic"&gt;GenAI, is it fantastic?
&lt;/h2&gt;&lt;p&gt;Quite a few people have shared their takes on GenAI for development in a very short time, so I realize it&amp;rsquo;s high time I posted this draft I started over two weeks ago 🙃.&lt;/p&gt;
&lt;p&gt;For work, I use AI assistants more and more to help me with my daily tasks. In 2024, it was mainly for automating tedious tasks (scripting stuff, making a painful list of repetitive tasks without coding it properly). Throughout 2025, I tested it several times for Ops work, and each time the results were mediocre, both for designing coherent, reliable and efficient infrastructure and for incident resolution assistance (see &lt;a class="link" href="https://blog.zwindler.fr/en/2025/08/15/ops-disparition-suite/#et-dans-linfra-" &gt;my article on the topic&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;In 2026, on that last point, I feel we still haven&amp;rsquo;t progressed, though I&amp;rsquo;ll admit there are specialized tools I haven&amp;rsquo;t tested enough yet, open source or not. I can mention &lt;a class="link" href="https://k8sgpt.ai/" target="_blank" rel="noopener"
&gt;k8sgpt&lt;/a&gt; and &lt;a class="link" href="https://github.com/HolmesGPT/holmesgpt" target="_blank" rel="noopener"
&gt;HolmesGPT&lt;/a&gt; for open source, and &lt;a class="link" href="https://www.anyshift.io/" target="_blank" rel="noopener"
&gt;Anyshift&lt;/a&gt; with its SRE agent for root cause detection and incident resolution.&lt;/p&gt;
&lt;p&gt;Note: fun fact, in his post &amp;ldquo;&lt;a class="link" href="https://alex.balmes.co/fr/blog/mon-positionnement-sur-l-intelligence-artificielle-generative#pour-les-ops" target="_blank" rel="noopener"
&gt;My position on Generative Artificial Intelligence&lt;/a&gt;&amp;rdquo; posted just before mine, Alex cites a long list of professions that will benefit from GenAI, and ops folks get nothing more than yet another argument to switch to Kubernetes 😭.&lt;/p&gt;
&lt;p&gt;On the other hand, I&amp;rsquo;ve started using GenAI to build several web projects from scratch (&lt;em&gt;vibe coding&lt;/em&gt;?), which I&amp;rsquo;ve already told you about:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://blog.zwindler.fr/2025/12/02/jai-donn%C3%A9-1-heure-%C3%A0-des-agents-copilot-pour-migrer-un-site-de-bloggrify-%C3%A0-hugo/" &gt;Converting a blog from Bloggrify to Hugo (French only)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://blog.zwindler.fr/en/2026/02/09/101-ways-to-deploy-kubernetes-a-brand-new-ui-to-explore-118-solutions/" &gt;Creating a &amp;ldquo;pretty&amp;rdquo; website&lt;/a&gt; (way beyond my skills) to host the research I&amp;rsquo;ve done on the different ways to deploy Kubernetes&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://blog.zwindler.fr/2026/02/20/securite-headers-http-observatory-hugo/" &gt;Improving my website&amp;rsquo;s security (French only)&lt;/a&gt; by automating modifications to the Hugo theme I use, to get the best score on Mozilla HTTP Observatory&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In all 3 cases, the result is there. Or rather, it &lt;em&gt;seems&lt;/em&gt; to be for the layman that I am.&lt;/p&gt;
&lt;p&gt;I&amp;rsquo;ve never hidden that I have zero front-end skills, and maybe for a domain expert, what I did with AI is horrible (or not?). In any case, it can&amp;rsquo;t be worse than the UIs I made without it. Small example 😄:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/groroti.avif"
loading="lazy"
&gt;&lt;/p&gt;
&lt;h2 id="or-rather-gêneai-sorry-that-a-french-play-on-word-i-cant-translate"&gt;or rather: GêneAI (sorry, that&amp;rsquo;s a French play on words I can&amp;rsquo;t translate)?
&lt;/h2&gt;&lt;p&gt;This is the kind of feedback you see from people who are &lt;em&gt;actually&lt;/em&gt; experts in a domain when they watch novices get excited. We have good examples with AI-generated videos: if you&amp;rsquo;re not paying attention, you think it&amp;rsquo;s incredible. But anyone with a slightly critical eye immediately sees &lt;strong&gt;BIG&lt;/strong&gt; consistency problems. Same goes for image generation, or music.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s also true for code, and we have great examples of &lt;em&gt;vibe coded&lt;/em&gt; projects that are horrific spaghetti messes and cybersecurity sieves. In short, you get it: even though I&amp;rsquo;m a daily user, I&amp;rsquo;m not &amp;ldquo;amazified&amp;rdquo; (as my kids would say) by LLMs, even the best ones.&lt;/p&gt;
&lt;p&gt;Yet, I have several friends who swear by them, including people far more brilliant/intelligent (call it what you will) than me. So I thought &lt;em&gt;&amp;ldquo;OK, maybe I&amp;rsquo;m doing it wrong. Let&amp;rsquo;s try with the best there is&amp;rdquo;&lt;/em&gt;: a &amp;ldquo;Claude Code&amp;rdquo;-type IDE and an Opus model.&lt;/p&gt;
&lt;h2 id="the-editor"&gt;The editor
&lt;/h2&gt;&lt;p&gt;After a quick market survey, you realize the options are plentiful: Claude Code, OpenCode, Amp Code, &amp;hellip;&lt;/p&gt;
&lt;p&gt;The problem with Claude Code is the entry price. I&amp;rsquo;m not yet ready to spend 100 or 200€ per month for the use I have today. I don&amp;rsquo;t know if the 20€ plan would be enough. Beyond the rare personal side projects I mentioned above, I don&amp;rsquo;t code much. Most of my geeky activities are infrastructure: installing Kubernetes clusters and virtualization OSes. Stuff that&amp;rsquo;s hard to automate with an LLM.&lt;/p&gt;
&lt;p&gt;Pierre (mostly) recommended &lt;a class="link" href="https://ampcode.com/" target="_blank" rel="noopener"
&gt;Amp Code&lt;/a&gt; for personal use, mainly because it has a free tier with a certain number of tokens and, if you&amp;rsquo;re not too greedy, it&amp;rsquo;s quite effective. I preferred to do things my own way and test &lt;strong&gt;OpenCode&lt;/strong&gt; instead, which has the advantage of being more flexible with LLM providers and token consumption. It&amp;rsquo;s even possible to use local models (and therefore free) or &lt;a class="link" href="https://opencode.ai/docs/fr/zen/" target="_blank" rel="noopener"
&gt;models included for free (at least for now) such as GLM 5 Free&lt;/a&gt;, which performs quite well among dev models (most importantly, it&amp;rsquo;s free&amp;hellip;).&lt;/p&gt;
&lt;p&gt;OK, we have the editor. Now, the code?&lt;/p&gt;
&lt;p&gt;To get a sense of the relevance of code generated by my favorite LLMs (Sonnet and Opus, for a while now), I need a language &lt;strong&gt;and&lt;/strong&gt; a use case that I master, so I can tell them &lt;em&gt;&amp;ldquo;no, that&amp;rsquo;s absolute nonsense!?&amp;rdquo;&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;A use case I master&amp;hellip; as you might guess, there will inevitably be some &lt;strong&gt;Kubernetes&lt;/strong&gt; in there.&lt;/p&gt;
&lt;p&gt;A language I master: I learned Go in 2022 thanks to a colleague at Deezer (thanks again Martial!) and I&amp;rsquo;ve published a few tools and made minor contributions. I wrote a large part of the &lt;a class="link" href="https://github.com/deezer/GroROTI" target="_blank" rel="noopener"
&gt;GroROTI&lt;/a&gt; tool and also started (but never finished) developing an RPG in Go, heavily inspired by Castle of the Winds, a game from my childhood: &lt;a class="link" href="https://github.com/zwindler/gocastle" target="_blank" rel="noopener"
&gt;gocastle&lt;/a&gt;. And I have professional-level proficiency (even if it&amp;rsquo;s not the core of my job) in my day-to-day work.&lt;/p&gt;
&lt;h2 id="after-the-context-the-project"&gt;After the context, the project
&lt;/h2&gt;&lt;p&gt;OK, we have the technical context: Kube + Go. Now we need the idea to implement. The project needs to be large enough for the test to be meaningful, fun enough that I want to spend personal time on it, but still somewhat useful so I&amp;rsquo;d want to talk about it and show progress. And above all, a project whose business logic stays within my reach: to honestly evaluate the quality of AI-generated code, I need to be able to read and critique it.&lt;/p&gt;
&lt;p&gt;And there, in my overflowing drawer of dumb ideas, I remembered this &amp;ldquo;pitch&amp;rdquo; from 2023-2024, which I&amp;rsquo;d left to rot for lack of time:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;PodSweeper&lt;/strong&gt;: the most complex, over-engineered and chaotic way to play Minesweeper.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/Wat8.webp"
loading="lazy"
&gt;&lt;/p&gt;
&lt;h2 id="dont-leave-youll-see-its-fun-really"&gt;Don&amp;rsquo;t leave, you&amp;rsquo;ll see it&amp;rsquo;s fun. Really!
&lt;/h2&gt;&lt;p&gt;Instead of a visual grid where you click on cells hoping not to hit a &amp;ldquo;mine&amp;rdquo;, we have a virtual &amp;ldquo;grid&amp;rdquo; where each cell is a Pod and the click is a &lt;code&gt;kubectl delete&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Yes, it&amp;rsquo;s dumb. I like it a lot :).&lt;/p&gt;
&lt;p&gt;Beyond the deliberate trolling (over-engineering from hell) of this game, there&amp;rsquo;s actually an educational objective behind it.&lt;/p&gt;
&lt;p&gt;Really, there is.&lt;/p&gt;
&lt;p&gt;I designed the game as an &lt;strong&gt;introduction to Kubernetes security&lt;/strong&gt;, with difficulty levels to unlock, CTF-style.&lt;/p&gt;
&lt;p&gt;At the beginning, you can do &lt;strong&gt;everything&lt;/strong&gt;. You can of course play normally (delete a Pod, see if there are mines nearby) to get the hang of it. You can also automate actions (a script that &amp;ldquo;clicks a cell&amp;rdquo; at random, retrieves proximity hints for mines, clicks somewhere safe, &amp;hellip;).&lt;/p&gt;
&lt;p&gt;But you can also &lt;strong&gt;cheat&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;And that&amp;rsquo;s the whole point of the game. In the first difficulty levels, the game&amp;rsquo;s Kubernetes manifests are deliberately vulnerable. So you can win effortlessly, if you know where to look.&lt;/p&gt;
&lt;p&gt;However, levels quickly progress and restrictions come with them. Pretty soon, you reach production-grade best practices that prevent any &amp;ldquo;cheating&amp;rdquo;. And if you find unintended vulnerabilities in the game code itself&amp;hellip; well, that&amp;rsquo;s bonus 😈.&lt;/p&gt;
&lt;h2 id="the-initial-process"&gt;The initial process
&lt;/h2&gt;&lt;p&gt;It&amp;rsquo;s probably a mistake, but I did the entire &lt;strong&gt;ideation phase with Gemini&lt;/strong&gt; before launching OpenCode. The experience would probably have been better (at least more representative) if I&amp;rsquo;d started directly with OpenCode.&lt;/p&gt;
&lt;p&gt;I dumped everything I had in mind, along with a big shapeless block of dozens of lines from my notes from when I had the idea in 2023, and asked it to produce a &lt;a class="link" href="https://github.com/zwindler/PodSweeper/blob/main/SPECIFICATION.md" target="_blank" rel="noopener"
&gt;SPECIFICATION.md&lt;/a&gt; file with all the important information, put back in order.&lt;/p&gt;
&lt;p&gt;Once the specs were written, I asked it to detail the different levels and the difficulties we would progressively add, in &lt;a class="link" href="https://github.com/zwindler/PodSweeper/blob/main/GAMEPLAY.md" target="_blank" rel="noopener"
&gt;GAMEPLAY.md&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Finally, I asked it to break down the project into &amp;ldquo;issues&amp;rdquo; by priority, so I could get an MVP quickly: &lt;a class="link" href="https://github.com/zwindler/PodSweeper/blob/main/ISSUES_PRIORITY.md" target="_blank" rel="noopener"
&gt;ISSUES_PRIORITY.md&lt;/a&gt;. My idea was to give OpenCode the very detailed, finely sliced battle plan and let it run.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;First disappointment&lt;/strong&gt;: Gemini 3 Pro is pretty bad at this. The tasks were credible(-ish) but ordered randomly (task dependencies not respected). So I launched OpenCode for the first time in my repo and told it &lt;em&gt;&amp;ldquo;here&amp;rsquo;s the context, here&amp;rsquo;s what we came up with for tasks. What do you think?&amp;rdquo;&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="the-good-opencode"&gt;The good (open)code&amp;hellip;
&lt;/h2&gt;&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/opencode.avif"
loading="lazy"
&gt;&lt;/p&gt;
&lt;p&gt;Among the pretty nice things: OpenCode + Claude Opus immediately saw that the plan created by Gemini was flawed, and asked me questions and proposed a breakdown that was still imperfect but already more coherent. This is probably where the main value of these tools lies. They embed knowledge to guide software development and above all (ABOVE ALL) bring structure.&lt;/p&gt;
&lt;p&gt;Pretty quickly, we agree on a plan I like. OpenCode updates the documents and starts developing (scaffolding, creating pipelines, first empty binaries and Docker images). We have a project ready to develop in a snap, which is quite exciting, at first glance.&lt;/p&gt;
&lt;p&gt;LLM assistants are very eager to jump into code, and this remains true with OpenCode+Opus. Even though we hadn&amp;rsquo;t finished discussing the plan and possible options, it was spontaneously proposing to move to code. That said, it was the same (or worse) with Gemini (to whom I had specifically said we wouldn&amp;rsquo;t be writing any code at all).&lt;/p&gt;
&lt;p&gt;Very quickly, file creations pile up. I struggle to keep up and at each pause (when the LLM has finished a task and asks if it can move to the next), I spend long minutes reading what the LLM produced. It&amp;rsquo;s both exhilarating and exhausting (re-reading all that is hard on my rusty brain).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The raw speed is impressive.&lt;/strong&gt; In 3 sessions of 1 to 2 hours, I have a working MVP. I understand why people regularly talk about 10x engineering when discussing code generation with recent LLMs. Measuring raw code generation speed doesn&amp;rsquo;t really make sense, but roughly speaking, functional code is produced 3x to 10x faster than I could have written it alone. Features that would have taken me hours to implement end to end (grid generation, reveal logic, game state management) land in minutes. You&amp;rsquo;ve read it elsewhere and I&amp;rsquo;m saying the same thing: the bottleneck is no longer the code I type, it&amp;rsquo;s the code I have to review.&lt;/p&gt;
&lt;p&gt;From what I&amp;rsquo;ve read, the code is correct, particularly in the first iterations, a bit less so after a while. But when I say &amp;ldquo;correct&amp;rdquo;, I might be underselling it: the code produced is &lt;strong&gt;idiomatic Go&lt;/strong&gt;. Good package structure, naming conventions respected, error handling in Go style (when it&amp;rsquo;s there&amp;hellip;), appropriate use of interfaces. This isn&amp;rsquo;t just code that compiles: it&amp;rsquo;s code a Go reviewer wouldn&amp;rsquo;t reject outright. We&amp;rsquo;re aiming for an MVP so the business logic remains very simple, but the foundation is clean.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Everything is tested.&lt;/strong&gt; Very quickly, the project has perfect coverage and 100+ tests, even though it doesn&amp;rsquo;t do anything yet. If this were human-written code, I&amp;rsquo;m not sure I&amp;rsquo;d call that a good thing: we&amp;rsquo;re absolutely not doing TDD, and a lot of the tests check useless things. In the case of LLM-generated code, though, it&amp;rsquo;s probably for the best, because the LLM tends to introduce regressions fairly regularly (we&amp;rsquo;ll come back to that). So it&amp;rsquo;s interesting here, because:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;it costs almost nothing in dev time (it&amp;rsquo;s so fast to generate)&lt;/li&gt;
&lt;li&gt;it allows the LLM to realize its code introduced a regression, and fix it on its own, without waiting.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="-and-the-bad-opencode"&gt;&amp;hellip; and the bad (open)code
&lt;/h2&gt;&lt;p&gt;On the tool and model side first, a few &lt;strong&gt;irritants&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;By default, OpenCode doesn&amp;rsquo;t encourage the LLM to run &lt;code&gt;golangci-lint&lt;/code&gt; on every commit. You can fix this with a pre-commit hook and/or the famous AGENTS.md / skills, but it SHOULD have been part of the basic kit for a Go project (the tests are there, after all&amp;hellip;).&lt;/li&gt;
&lt;li&gt;The LLM (Claude Opus on OpenCode, but this is a common bias even outside of it) &lt;strong&gt;loves&lt;/strong&gt; software versions that existed at the time of its training. You have to constantly remind it that the versions it suggests are outdated. This is true for everything: Go code, dependencies, GitHub Actions versions.&lt;/li&gt;
&lt;li&gt;You very often hit the 100k token limit. You waste time &amp;ldquo;compacting&amp;rdquo; and lose precision. Typically: the LLM asks if it can &lt;code&gt;git commit&lt;/code&gt; changes, the context compacts before I answer, and it commits without my permission while replaying the context. For code this low-stakes, it&amp;rsquo;s fine. But in prod, that would be a real problem.&lt;/li&gt;
&lt;li&gt;The LLM is often amnesiac: it forgets it has access to a kind cluster for real testing (even though it did it just before the previous compaction, for example).&lt;/li&gt;
&lt;/ul&gt;
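&lt;p&gt;As a rough sketch of the pre-commit side of that fix, assuming the pre-commit framework is used and &lt;code&gt;golangci-lint&lt;/code&gt; is already installed locally (this is not PodSweeper&amp;rsquo;s actual configuration), a &lt;code&gt;.pre-commit-config.yaml&lt;/code&gt; could look like this:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# .pre-commit-config.yaml (illustrative, not taken from the PodSweeper repo)
repos:
  - repo: local
    hooks:
      - id: golangci-lint
        name: golangci-lint
        # Assumes the golangci-lint binary is on the PATH
        entry: golangci-lint run
        language: system
        types: [go]
        # Lint the whole module rather than only the staged files
        pass_filenames: false
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Using a &lt;code&gt;local&lt;/code&gt; hook avoids pinning a linter version in the file, which is one less outdated version for the LLM to suggest.&lt;/p&gt;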
&lt;p&gt;On the &lt;strong&gt;code quality&lt;/strong&gt; side, it&amp;rsquo;s more concerning:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Lax error handling.&lt;/strong&gt; Some errors that should have been FATAL (a controller that can&amp;rsquo;t initialize properly!) are simply logged as ERROR. You end up shipping versions that fail silently. Again, this can probably be tuned with an AGENTS.md.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Race conditions.&lt;/strong&gt; I ran into race conditions fairly quickly, and pretty silly ones at that. In my case, pods stuck in &lt;code&gt;Terminating&lt;/code&gt; affected the next game (poor graceful shutdown handling). Opus went into a loop without understanding the root cause. I had to stop it and specify that it should never start a new game without making sure the previous one was finished.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code generation outside the lines.&lt;/strong&gt; Sometimes, without warning, the LLM adds a feature that wasn&amp;rsquo;t requested, is useless, or even contradicts the previously written SPECIFICATION. You have to slap it back into line.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once put back on track, the LLM gets going again, but these episodes clearly show that for concurrent code, or for anything outside of ultra-strict user stories (which goes beyond OpenCode&amp;rsquo;s default workflow), human supervision remains essential.&lt;/p&gt;
&lt;p&gt;People will tell me I&amp;rsquo;m a beginner with these tools, and I could have avoided some pitfalls by better configuring my environment (AGENTS.md, pre-commit hooks, etc.). That&amp;rsquo;s true.&lt;/p&gt;
&lt;p&gt;But that&amp;rsquo;s exactly where the shoe pinches: one of the hyped promises of GenAI applied to development is to enable non-developers (or beginners) to produce quality software at high speed, so they can ship complete software. If getting the most out of it requires being an experienced developer who knows how to configure a complex environment (with the specifics of these AI tools), anticipate race conditions and review concurrent code&amp;hellip; we haven&amp;rsquo;t democratized development, we&amp;rsquo;ve just given another tool to people who already knew how to code. My profile (ops who codes, not a pure dev) was precisely the target audience of this promise. And today, the math doesn&amp;rsquo;t check out.&lt;/p&gt;
&lt;h2 id="where-am-i-at"&gt;Where am I at?
&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;ve been talking about PodSweeper for a while now. But where does the project stand?&lt;/p&gt;
&lt;p&gt;After 3 sessions of 1 to 2 hours max, not counting ideation, I have a working MVP. As I said earlier, the productivity gain is undeniable, for someone who isn&amp;rsquo;t a &amp;ldquo;developer&amp;rdquo; by trade.&lt;/p&gt;
&lt;p&gt;The code is probably better quality overall than the Go code I could have written, &lt;strong&gt;but&lt;/strong&gt; the LLM lets gaping holes through in the business logic (simply because it doesn&amp;rsquo;t think, while I do. Well, normally). That&amp;rsquo;s hard on trust&amp;hellip; You really have to stay very wary of everything it produces.&lt;/p&gt;
&lt;p&gt;The code is available at:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/zwindler/PodSweeper/tree/main" target="_blank" rel="noopener"
&gt;https://github.com/zwindler/PodSweeper/tree/main&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/03/podsweeper.avif"
loading="lazy"
&gt;&lt;/p&gt;
&lt;p&gt;You can go check out the code to form your own opinion. If you&amp;rsquo;re not afraid of spoilers, you can read the &lt;a class="link" href="https://github.com/zwindler/PodSweeper/blob/main/SPECIFICATION.md" target="_blank" rel="noopener"
&gt;SPECIFICATION.md&lt;/a&gt; (spoilers), or even &lt;a class="link" href="https://github.com/zwindler/PodSweeper/blob/main/GAMEPLAY.md" target="_blank" rel="noopener"
&gt;GAMEPLAY.md&lt;/a&gt; (I detail all the levels there so it&amp;rsquo;s mega spoilers).&lt;/p&gt;
&lt;p&gt;You can (normally) play the first difficulty levels. It&amp;rsquo;s still fairly basic: if you&amp;rsquo;ve already used &lt;code&gt;kubectl&lt;/code&gt;, you should find the solution very quickly. My goal is to add levels over time. I&amp;rsquo;ve already imagined 10, some quite devious 😈, and others might come later.&lt;/p&gt;
&lt;p&gt;The code is under MPL v2.0 because it&amp;rsquo;s a copyleft license I like, &lt;a class="link" href="https://www.tldrlegal.com/license/mozilla-public-license-2-0-mpl-2" target="_blank" rel="noopener"
&gt;since it&amp;rsquo;s not very restrictive, OSI-approved, and still requires you to share your changes to the licensed files if you distribute them&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;There you go :) if you also find it fun, feel free to test and/or give me feedback.&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ kubectl apply -k https://github.com/zwindler/podsweeper//deploy/base?ref&lt;span class="o"&gt;=&lt;/span&gt;v0.1.4
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;namespace/podsweeper-game created
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;...
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;$ kubectl &lt;span class="nb"&gt;wait&lt;/span&gt; --for&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ready pod -l app.kubernetes.io/name&lt;span class="o"&gt;=&lt;/span&gt;podsweeper &lt;span class="se"&gt;\
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; -n podsweeper-game --timeout&lt;span class="o"&gt;=&lt;/span&gt;60s
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;pod/gamemaster-54f4dddcc4-6m88z condition met
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Have fun!&lt;/p&gt;</description></item><item><title>Kyverno killed my API Server. Again.</title><link>https://blog.zwindler.fr/en/2026/02/26/kyverno-killed-my-api-server.-again./</link><pubDate>Thu, 26 Feb 2026 08:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/02/26/kyverno-killed-my-api-server.-again./</guid><description>&lt;img src="https://blog.zwindler.fr/2026/02/0_days_without_kyverno.webp" alt="Featured image of post Kyverno killed my API Server. Again." /&gt;&lt;h2 id="the-revenge-strikes-back"&gt;The revenge strikes back
&lt;/h2&gt;&lt;p&gt;You may have read &lt;a class="link" href="https://blog.zwindler.fr/en/2023/11/30/kubernetes-error-etcdserver-mvcc-database-space-exceeded/" &gt;my previous article about etcd 3 years ago&lt;/a&gt;. If you remember correctly, what crashed was etcd, but the real culprit was Kyverno. I love Kyverno. It&amp;rsquo;s really a piece of software I&amp;rsquo;m very fond of, mainly because it&amp;rsquo;s incredibly powerful. I even wrote &lt;a class="link" href="https://blog.zwindler.fr/2022/08/01/vos-politiques-de-conformite-sur-kubernetes-avec-kyverno/" &gt;an introductory article&lt;/a&gt; and &lt;a class="link" href="https://blog.zwindler.fr/2022/09/05/vos-politiques-de-conformite-sur-kubernetes-avec-kyverno-part2/" &gt;a second one that goes deeper&lt;/a&gt; on the topic (both in French, though).&lt;/p&gt;
&lt;p&gt;But the sheer number of incidents and weird side effects it causes&amp;hellip; Mamma mia! This isn&amp;rsquo;t the first incident of the year I&amp;rsquo;ve had with Kyverno (yes, we&amp;rsquo;re in February), but since this one is entertaining, I&amp;rsquo;m sharing it with you.&lt;/p&gt;
&lt;p&gt;During a routine maintenance operation to upgrade a Kubernetes cluster to version &lt;strong&gt;1.34&lt;/strong&gt; (from 1.32), we ended up facing the dreaded scenario for any kube admin: a completely unreachable API Server after restarting the Control Plane nodes.&lt;/p&gt;
&lt;p&gt;What initially looked like a typical network error turned out to be a subtle &lt;strong&gt;deadlock&lt;/strong&gt; between new native Kubernetes networking features and our dear Kyverno 😘.&lt;/p&gt;
&lt;p&gt;Spoiler: it wasn&amp;rsquo;t a network issue. It&amp;rsquo;s never a network issue. Well, sometimes it is. But not this time.&lt;/p&gt;
&lt;h2 id="the-upgrade-that-starts-well"&gt;The upgrade that starts well
&lt;/h2&gt;&lt;p&gt;Alright, a Kubernetes upgrade has become pretty routine at this point. We do it regularly, we have our procedures, we&amp;rsquo;re pros (I swear). We jump from 1.32 to 1.34 in a single commit, skipping the hop through 1.33.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;YOLO.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;In the technical context I&amp;rsquo;m talking about, everything is managed as code. From machine provisioning all the way to Talos deployment, including MachineConfigs (the CustomResources to modify&amp;hellip; well, the machine).&lt;/p&gt;
&lt;p&gt;For more details, see &lt;a class="link" href="https://docs.siderolabs.com/talos/v1.12/reference/configuration/v1alpha1/config#machineconfig" target="_blank" rel="noopener"
&gt;the Talos documentation on Machine Configs&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The first cluster we test has only one control plane (don&amp;rsquo;t ask me why, it probably wouldn&amp;rsquo;t have changed anything). Talos restarts the API Server with the new version and then&amp;hellip; nothing.&lt;/p&gt;
&lt;p&gt;The &amp;ldquo;weird&amp;rdquo; API Server logs (technical term) speak for themselves:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-log" data-lang="log"&gt;I0224 15:32:50.979280 1 default_servicecidr_controller.go:166] Creating default ServiceCIDR with CIDRs: [10.1.0.0/20]
W0224 15:32:50.984784 1 dispatcher.go:225] rejected by webhook &amp;#34;validate.kyverno.svc-fail&amp;#34;:
admission webhook &amp;#34;validate.kyverno.svc-fail&amp;#34; denied the request:
Get &amp;#34;https://10.1.0.1:443/api&amp;#34;: dial tcp 10.1.0.1:443: connect: operation not permitted
I0224 15:32:50.985342 1 event.go:389] &amp;#34;Event occurred&amp;#34; kind=&amp;#34;ServiceCIDR&amp;#34;
apiVersion=&amp;#34;networking.k8s.io/v1&amp;#34; type=&amp;#34;Warning&amp;#34;
reason=&amp;#34;KubernetesDefaultServiceCIDRError&amp;#34;
message=&amp;#34;The default ServiceCIDR can not be created&amp;#34;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;😬😬😬&lt;/p&gt;
&lt;h2 id="the-root-cause-a-magnificent-vicious-circle"&gt;The root cause: a magnificent vicious circle
&lt;/h2&gt;&lt;p&gt;After investigation, we discovered that the incident was the result of a collision between a Kubernetes core evolution and our Kyverno configuration. A textbook deadlock case.&lt;/p&gt;
&lt;p&gt;Let&amp;rsquo;s break down the mechanism:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The new &lt;code&gt;ServiceCIDR&lt;/code&gt; Kind&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In recent versions (v1.33+), Kubernetes migrates service IP range management to dedicated objects named &lt;code&gt;ServiceCIDR&lt;/code&gt;. On the first boot after the upgrade, the API Server automatically tries to create the default object (e.g., &lt;code&gt;10.1.0.0/20&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;For the curious, &lt;a class="link" href="https://github.com/kubernetes/enhancements/issues/1880" target="_blank" rel="noopener"
&gt;KEP-1880&lt;/a&gt; and the &lt;a class="link" href="https://kubernetes.io/docs/reference/kubernetes-api/cluster-resources/service-cidr-v1/" target="_blank" rel="noopener"
&gt;official ServiceCIDR documentation&lt;/a&gt; detail this evolution.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s new, it&amp;rsquo;s clean, it&amp;rsquo;s well designed. Except that&amp;hellip;&lt;/p&gt;
&lt;ol start="2"&gt;
&lt;li&gt;Interception by the Kyverno Webhook&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Kyverno, configured with &lt;code&gt;failurePolicy: Fail&lt;/code&gt; (because we&amp;rsquo;re serious people who don&amp;rsquo;t let just anything through in prod), is set up to intercept resource creations to validate them, and &lt;strong&gt;fail the call if Kyverno doesn&amp;rsquo;t respond&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Including the &lt;code&gt;ServiceCIDR&lt;/code&gt; freshly created by the API Server itself.&lt;/p&gt;
&lt;ol start="3"&gt;
&lt;li&gt;Deadlock&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;And this is where it gets beautiful:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The API Server pauses the &lt;code&gt;ServiceCIDR&lt;/code&gt; creation waiting for Kyverno&amp;rsquo;s &amp;ldquo;OK&amp;rdquo;&lt;/li&gt;
&lt;li&gt;To contact the Kyverno service, the API Server needs to route the request through the Kubernetes service IP (typically &lt;code&gt;10.1.0.1&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;But&lt;/strong&gt; the network layer (service routing) can&amp;rsquo;t initialize until the &lt;code&gt;ServiceCIDR&lt;/code&gt; object is validated and created&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It&amp;rsquo;s the chicken and the egg, &amp;ldquo;I locked my keys inside the car&amp;rdquo; edition.&lt;/p&gt;
&lt;p&gt;PTSD. Yes, that actually happened to me. In the desert. With no cell service.&lt;/p&gt;
&lt;ol start="4"&gt;
&lt;li&gt;Profit.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The API Server times out or returns a &lt;code&gt;connect: operation not permitted&lt;/code&gt; error when trying to reach the webhook, blocking its own initialization. CrashLoopBackOff on the API Server. :D&lt;/p&gt;
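&lt;p&gt;To make the vicious circle more concrete, here is a minimal sketch of what a catch-all validating webhook with &lt;code&gt;failurePolicy: Fail&lt;/code&gt; can look like. The webhook name comes from the log above; the configuration name, service coordinates, path and rules are assumptions for illustration, not a dump of our actual setup:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: kyverno-resource-validating-webhook-cfg
webhooks:
  - name: validate.kyverno.svc-fail
    # If the webhook cannot be reached, the request is rejected
    failurePolicy: Fail
    clientConfig:
      service:
        # Assumed service coordinates (illustrative)
        name: kyverno-svc
        namespace: kyverno
        path: /validate/fail
    rules:
      # Catch-all rule: it also matches the ServiceCIDR created at boot
      - apiGroups: [&amp;#34;*&amp;#34;]
        apiVersions: [&amp;#34;*&amp;#34;]
        operations: [&amp;#34;CREATE&amp;#34;, &amp;#34;UPDATE&amp;#34;]
        resources: [&amp;#34;*/*&amp;#34;]
    sideEffects: None
    admissionReviewVersions: [&amp;#34;v1&amp;#34;]
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;With a rule this broad, nothing distinguishes the API Server&amp;rsquo;s own bootstrap objects from any other resource, which is exactly what bit us.&lt;/p&gt;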
&lt;h2 id="breaking-out-of-the-deadlock"&gt;Breaking out of the deadlock
&lt;/h2&gt;&lt;p&gt;To escape this deadlock, you need to temporarily bypass the admission layer. Easy, right?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The &amp;ldquo;usual&amp;rdquo; workaround: useless&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Deadlocks with Kyverno, we&amp;rsquo;re used to them at this point. Normally, since &lt;code&gt;kube-system&lt;/code&gt; is ignored, you can simply connect with a break-glass kubeconfig (we normally use OIDC) that has the cluster-admin cluster role and delete the Kyverno validating webhooks:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;kubectl delete validatingwebhookconfiguration kyverno-resource-validating-webhook-cfg
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;&lt;strong&gt;Except here&lt;/strong&gt;, the API Server won&amp;rsquo;t even start. My &lt;code&gt;kubectl&lt;/code&gt; isn&amp;rsquo;t going to work, obviously!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The real workaround: disable webhooks at boot&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The solution we chose was to modify the API Server configuration to temporarily disable validating admission webhooks at startup. My esteemed colleague Maxime hot-edited the machine config (using break-glass &lt;code&gt;talosctl&lt;/code&gt; access) &lt;a class="link" href="https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#how-do-i-turn-off-an-admission-controller" target="_blank" rel="noopener"
&gt;to add the following flag&lt;/a&gt; directly in the API server&amp;rsquo;s &lt;code&gt;extraArgs&lt;/code&gt;:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code&gt;--disable-admission-plugins=ValidatingAdmissionWebhook
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;For those unfamiliar with admission control in Kubernetes, just know that there&amp;rsquo;s a list of &amp;ldquo;default&amp;rdquo; plugins but everything can be toggled off. I might do a deep dive on Kubernetes admission control someday, it&amp;rsquo;s fascinating ;).&lt;/p&gt;
&lt;p&gt;With this flag, the API Server can finally create its &lt;code&gt;ServiceCIDR&lt;/code&gt; objects without asking anyone for permission (completely bypassing all validation mechanisms that Kyverno or similar tools &lt;em&gt;enforce&lt;/em&gt;), the network initializes, Kyverno starts, and then you can remove the flag and restart cleanly.&lt;/p&gt;
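&lt;p&gt;For reference, in a Talos machine config this kind of change could look roughly like the patch below. It&amp;rsquo;s a sketch, assuming the &lt;code&gt;cluster.apiServer.extraArgs&lt;/code&gt; map (keys written without the leading dashes); the rest of the machine config is omitted, and this isn&amp;rsquo;t a copy of what Maxime actually applied:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# Temporary patch: remove it once the cluster is healthy again
cluster:
  apiServer:
    extraArgs:
      # Equivalent to --disable-admission-plugins=ValidatingAdmissionWebhook
      disable-admission-plugins: ValidatingAdmissionWebhook
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Keep in mind that while this is in place, no validating webhook runs at all, so the window should be as short as possible.&lt;/p&gt;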
&lt;p&gt;&lt;strong&gt;The &amp;ldquo;funny&amp;rdquo; option we didn&amp;rsquo;t try&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Personally, I thought it would be hilarious to go directly into the etcd database and delete the webhook key causing the issue (also through &lt;code&gt;talosctl&lt;/code&gt;):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-bash" data-lang="bash"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# Example via etcdctl&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;etcdctl del /registry/admissionregistration.k8s.io/validatingwebhookconfigurations/kyverno-resource-validating-webhook-cfg
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;My colleagues were less enthusiastic: &amp;ldquo;Yeah but you know, if we break etcd it&amp;rsquo;s gonna be painful&amp;rdquo;. We played it safe with the flag. I&amp;rsquo;m deeply disappointed we didn&amp;rsquo;t try 😂.&lt;/p&gt;
&lt;h2 id="the-permanent-fix-matchconditions"&gt;The permanent fix: MatchConditions
&lt;/h2&gt;&lt;p&gt;OK, now that the cluster is back up, how do we make sure this doesn&amp;rsquo;t happen again on the next upgrade?&lt;/p&gt;
&lt;p&gt;The clean solution is to use &lt;code&gt;matchConditions&lt;/code&gt; (introduced as alpha in Kubernetes 1.27, stable since 1.30) on the &lt;code&gt;ValidatingWebhookConfiguration&lt;/code&gt;. This allows you to exclude critical network bootstrap resources &lt;strong&gt;before&lt;/strong&gt; the request even attempts to leave the API Server toward the Kyverno pod.&lt;/p&gt;
&lt;p&gt;See &lt;a class="link" href="https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#matching-requests-matchconditions" target="_blank" rel="noopener"
&gt;the official documentation on matchConditions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We were already using this option to cut down the sometimes excessive Kyverno traffic (if you manage Kyverno, you know what I&amp;rsquo;m talking about) for a number of event types, which would otherwise overwhelm the API server or Kyverno, in CPU or RAM depending on the case. We just had to add exclusions for the new types:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c"&gt;# Exclude network bootstrap resources to prevent the deadlock&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nt"&gt;matchConditions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;exclude-ServiceCIDR&amp;#39;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;!(request.kind.kind == &amp;#34;ServiceCIDR&amp;#34;)&amp;#39;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;- &lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;exclude-IPAddress&amp;#39;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;expression&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;!(request.kind.kind == &amp;#34;IPAddress&amp;#34;)&amp;#39;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;With this, when the API Server creates a &lt;code&gt;ServiceCIDR&lt;/code&gt; at boot, the request no longer goes through the Kyverno webhook. No circular dependency, no deadlock, everyone&amp;rsquo;s happy.&lt;/p&gt;
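&lt;p&gt;A possible variant, which is not what we deployed: since both kinds live in the &lt;code&gt;networking.k8s.io&lt;/code&gt; API group, the two conditions could be folded into a single CEL expression that also checks the group, so that a custom resource that happens to be named &lt;code&gt;IPAddress&lt;/code&gt; in another group would still be validated. The condition name is made up for the example:&lt;/p&gt;
&lt;pre tabindex="0"&gt;&lt;code class="language-yaml" data-lang="yaml"&gt;# Hypothetical single-condition variant of the exclusions above
matchConditions:
  - name: &amp;#39;exclude-network-bootstrap&amp;#39;
    expression: &amp;#39;!(request.kind.group == &amp;#34;networking.k8s.io&amp;#34; &amp;amp;&amp;amp; request.kind.kind in [&amp;#34;ServiceCIDR&amp;#34;, &amp;#34;IPAddress&amp;#34;])&amp;#39;
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Either form works on the same principle: the webhook is only called when every condition evaluates to true, so the excluded kinds never leave the API Server.&lt;/p&gt;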
&lt;h2 id="conclusion"&gt;Conclusion
&lt;/h2&gt;&lt;p&gt;As the current French president would say about something that was painfully predictable:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;who could have predicted this?&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;OK fine, all we had to do was read the Kubernetes 1.33 release notes. That said, we have a staging cluster, that&amp;rsquo;s what it&amp;rsquo;s for. We broke staging, no big deal.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/bigdeal.avif"
loading="lazy"
&gt;&lt;/p&gt;
&lt;p&gt;Context: this is the mascot of the &amp;ldquo;Big deal&amp;rdquo; TV game show.&lt;/p&gt;
&lt;p&gt;Maybe we&amp;rsquo;ll actually read them next time?&lt;/p&gt;</description></item><item><title>101 ways to deploy Kubernetes: a brand new UI to explore 118+ solutions</title><link>https://blog.zwindler.fr/en/2026/02/09/101-ways-to-deploy-kubernetes-a-brand-new-ui-to-explore-118-solutions/</link><pubDate>Mon, 09 Feb 2026 18:00:00 +0200</pubDate><guid>https://blog.zwindler.fr/en/2026/02/09/101-ways-to-deploy-kubernetes-a-brand-new-ui-to-explore-118-solutions/</guid><description>&lt;img src="https://blog.zwindler.fr/2026/02/101-kubernetes-ui-screenshot.webp" alt="Featured image of post 101 ways to deploy Kubernetes: a brand new UI to explore 118+ solutions" /&gt;&lt;h2 id="from-google-sheet-to-a-real-web-application"&gt;From Google Sheet to a real web application
&lt;/h2&gt;&lt;p&gt;You might remember my previous posts about this project: first a &lt;a class="link" href="https://blog.zwindler.fr/en/2025/11/02/93-ways-to-deploy-kubernetes-ive-cataloged-almost-all-existing-methods/" target="_blank" rel="noopener"
&gt;simple Google Sheet with 93 methods&lt;/a&gt;, then a &lt;a class="link" href="https://blog.zwindler.fr/en/2026/01/04/101-ways-to-deploy-kubernetes-v2/" target="_blank" rel="noopener"
&gt;GitHub repository with over 100 entries&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Today, I can present the latest iteration of this project: &lt;strong&gt;a real web interface&lt;/strong&gt; to explore all these solutions!&lt;/p&gt;
&lt;p&gt;👉 &lt;a class="link" href="https://zwindler.github.io/101-ways-to-deploy-kubernetes/" target="_blank" rel="noopener"
&gt;101-ways-to-deploy-kubernetes&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/101-kubernetes-ui-screenshot.avif"
loading="lazy"
alt="Web interface for the 101 ways to deploy Kubernetes project"
&gt;&lt;/p&gt;
&lt;h2 id="why-a-ui"&gt;Why a UI?
&lt;/h2&gt;&lt;p&gt;The Markdown table on GitHub was already better than the Google Sheet for collaboration, but barely. Hard to parse, hard to add columns without it becoming a mess (it already was, haha), and above all, incredibly UGLY.&lt;/p&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/01/101-kubernetes-v2-screenshot.avif"
loading="lazy"
alt="Old Markdown table on GitHub, hard to read"
&gt;&lt;/p&gt;
&lt;p&gt;I therefore decided to turn all of this into a modern and intuitive interface, with the help of an LLM.&lt;/p&gt;
&lt;h2 id="tech-stack-astro--tailwind"&gt;Tech stack: Astro + Tailwind
&lt;/h2&gt;&lt;p&gt;For this project, I chose a simple but effective stack:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://astro.build/" target="_blank" rel="noopener"
&gt;Astro&lt;/a&gt;&lt;/strong&gt;: a modern framework that generates ultra-fast static sites&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;a class="link" href="https://tailwindcss.com/" target="_blank" rel="noopener"
&gt;Tailwind CSS&lt;/a&gt;&lt;/strong&gt;: for hassle-free responsive design&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The result? A lightweight, fairly fast site that works on both &lt;strong&gt;desktop&lt;/strong&gt; and &lt;strong&gt;mobile&lt;/strong&gt; (though the desktop experience remains more comfortable given the amount of data).&lt;/p&gt;
&lt;h2 id="features"&gt;Features
&lt;/h2&gt;&lt;p&gt;&lt;strong&gt;Cards for each solution&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;No more spreadsheet-style table worthy of a backend dev (no, worse, a kube engineer&amp;hellip;)! Each tool now has its own animated &amp;ldquo;card&amp;rdquo; with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The project logo&lt;/li&gt;
&lt;li&gt;The license type (OSS or proprietary)&lt;/li&gt;
&lt;li&gt;The GitHub star count&lt;/li&gt;
&lt;li&gt;Direct links to the project and third-party resources (independent blogs, experience reports, tutorials&amp;hellip;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;img src="https://blog.zwindler.fr/2026/02/101-kubernetes-zoom.avif"
loading="lazy"
alt="Detailed view of a Kubernetes solution card with logo, license, and GitHub stars"
&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Powerful filters&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Looking only for open source solutions? Tools for local development? Management platforms?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Category&lt;/strong&gt; filters make navigation easy:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Desktop (local development)&lt;/li&gt;
&lt;li&gt;Managed (cloud offerings)&lt;/li&gt;
&lt;li&gt;Self-hosted (on-premise automation)&lt;/li&gt;
&lt;li&gt;Infra As Code&lt;/li&gt;
&lt;li&gt;Kubernetes OS (specialized operating systems)&lt;/li&gt;
&lt;li&gt;Management Platform&lt;/li&gt;
&lt;li&gt;Kubernetes in Kubernetes&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can also filter by &lt;strong&gt;status&lt;/strong&gt; (active, abandoned) or show only &lt;strong&gt;open source&lt;/strong&gt; or &lt;strong&gt;production ready&lt;/strong&gt; solutions.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;A search bar&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Know what you&amp;rsquo;re looking for? Just type the name in the search bar to find the solution instantly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tags for refinement&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Beyond categories, tags help quickly identify underlying technologies (kubeadm, k3s, k0s&amp;hellip;).&lt;/p&gt;
&lt;h2 id="did-you-know-at-least-18-tools-use-kubeadm"&gt;Did you know? At least 18 tools use kubeadm!
&lt;/h2&gt;&lt;p&gt;While compiling all this data, I discovered something fascinating: &lt;strong&gt;at least 18 tools&lt;/strong&gt; use &lt;code&gt;kubeadm&lt;/code&gt; as a backend to deploy Kubernetes! 🤯&lt;/p&gt;
&lt;p&gt;And that&amp;rsquo;s not even counting the managed Kubernetes offerings from cloud providers!&lt;/p&gt;
&lt;p&gt;This is exactly the kind of information you can now visualize instantly thanks to this new interface.&lt;/p&gt;
&lt;h2 id="a-collaborative-project"&gt;A collaborative project
&lt;/h2&gt;&lt;p&gt;The project remains &lt;strong&gt;100% open source&lt;/strong&gt; and collaborative. The data is still stored in the GitHub repository, and the UI is automatically generated from that data (I even added PR previews).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is a tool or provider missing?&lt;/strong&gt; Spotted a bug? Feel free to &lt;a class="link" href="https://github.com/zwindler/101-ways-to-deploy-kubernetes/issues" target="_blank" rel="noopener"
&gt;open an issue&lt;/a&gt; or submit a Pull Request!&lt;/p&gt;
&lt;p&gt;The project now lists &lt;strong&gt;118 solutions&lt;/strong&gt; (and I know there are certainly more missing), each with:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Up-to-date links&lt;/li&gt;
&lt;li&gt;Project status&lt;/li&gt;
&lt;li&gt;External references (tutorials, experience reports&amp;hellip;)&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id="try-it-comment-share"&gt;Try it, comment, share
&lt;/h2&gt;&lt;p&gt;Head over to &lt;a class="link" href="https://zwindler.github.io/101-ways-to-deploy-kubernetes/" target="_blank" rel="noopener"
&gt;zwindler.github.io/101-ways-to-deploy-kubernetes&lt;/a&gt; to explore all these solutions!&lt;/p&gt;
&lt;p&gt;OK, this is a shameless &amp;ldquo;call to action&amp;rdquo; like you see on every social network. Fair enough.&lt;/p&gt;
&lt;p&gt;However, I can&amp;rsquo;t know if this is useful (or not) if you don&amp;rsquo;t tell me. I can leave it as is (it&amp;rsquo;s not a big deal, I have plenty of other projects waiting for my spare time) or keep it alive, if you like it / find it useful.&lt;/p&gt;
&lt;p&gt;And if you do find this project useful, please:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Star&lt;/strong&gt; the project on &lt;a class="link" href="https://github.com/zwindler/101-ways-to-deploy-kubernetes" target="_blank" rel="noopener"
&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Share&lt;/strong&gt; it with your colleagues in the Cloud Native community&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Contribute&lt;/strong&gt; by adding missing tools or fixing errors&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Thanks in advance! 🙏&lt;/p&gt;</description></item></channel></rss>