TL;DR
This is a bit old now, but I still wanted to share a quick write-up on the topic.
Back in January, a cybersecurity researcher reported a Kubernetes flaw that generated quite a buzz. It had been a while since we had a Kube vulnerability that got people talking this much, at least from Denis’s memory (yes, I talk about myself in the third person).
Honestly, it’s pretty wild: the nodes/proxy GET RBAC permission allows any ServiceAccount that holds it to execute code inside any Pod in the cluster, without leaving a single trace in the audit logs. That’s unfortunate, especially when you have ServiceAccounts named rook-ceph-system that also happen to have read access to all Secrets in the cluster.
This article details the issue, how to check if you are vulnerable, the fixes to apply, and the preventive measures you can put in place if you can’t patch right away.
The problem: WebSocket + Kubelet = un-audited exec
The vulnerability was documented by Graham Helton in this article. Here is how it works.
The Kubernetes API exposes a nodes/proxy subresource that proxies HTTP requests to each node’s Kubelet. The Kubelet itself exposes an API on port 10250, specifically the /exec endpoint which allows executing commands inside a container.
The issue comes from how the Kubelet handles authorizations for WebSocket connections:
- kubectl exec uses a WebSocket connection, whose handshake is an HTTP GET
- The Kubelet maps this initial GET to the get RBAC verb
- It checks nodes/proxy GET, then authorizes the operation
- No secondary check is performed for the CREATE verb normally required for /exec
Result: any ServiceAccount with nodes/proxy GET can execute commands in any Pod in the cluster, including system Pods (etcd, kube-apiserver, etc.).
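To make the verb mapping concrete, here is a small Python model of how a Kubelet request gets turned into RBAC attributes. It is a simplified sketch, not the Kubelet's actual code: the function name `kubelet_authz_attributes` is made up for this illustration, and the two lookup tables are a shortened version of the mapping described in the Kubelet authorization documentation.

```python
# Simplified model of how the Kubelet derives RBAC attributes from a request.
# Illustration only: names are hypothetical, the tables are abbreviated.

# HTTP method -> RBAC verb.
METHOD_TO_VERB = {
    "POST": "create",
    "GET": "get",
    "HEAD": "get",
    "PUT": "update",
    "PATCH": "patch",
    "DELETE": "delete",
}

# Kubelet path prefix -> node subresource; everything else falls under nodes/proxy.
PATH_TO_SUBRESOURCE = {
    "/stats": "stats",
    "/metrics": "metrics",
    "/logs": "log",
    "/spec": "spec",
}


def kubelet_authz_attributes(method: str, path: str) -> tuple[str, str]:
    """Return the (verb, resource) pair a request is authorized against."""
    verb = METHOD_TO_VERB[method.upper()]
    for prefix, subresource in PATH_TO_SUBRESOURCE.items():
        if path == prefix or path.startswith(prefix + "/"):
            return verb, f"nodes/{subresource}"
    return verb, "nodes/proxy"


# A WebSocket upgrade starts as an HTTP GET, so exec is checked as a read.
websocket_exec = kubelet_authz_attributes("GET", "/exec/default/nginx/nginx")
# The same endpoint reached with a POST would be checked as a write.
post_exec = kubelet_authz_attributes("POST", "/exec/default/nginx/nginx")
print(websocket_exec, post_exec)
```

The point of the model: nothing in this mapping looks at what the endpoint does, only at the HTTP method, which is why a GET handshake on /exec ends up authorized as a read on nodes/proxy.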
# Exploitation via websocat
websocat --insecure \
  --header "Authorization: Bearer $TOKEN" \
  --protocol "v4.channel.k8s.io" \
  "wss://$NODE_IP:10250/exec/default/nginx/nginx?output=1&error=1&command=id"
And that’s not all. Commands executed via this method do not generate any Kubernetes audit logs (well, assuming you even collect them 🙈). The access goes directly through the Kubelet, which does not report events back to the API server.
The official Kubernetes status on this: Won’t Fix. It is a “design behavior” (note the quotes), addressed via a feature gate (KEP-2862, see below).
Ouch.
The Audit: Vulnerable ServiceAccounts on our clusters
In January 2026, following the publication of Graham Helton’s article, a lot of people had to urgently audit their clusters. You can either manually audit all your Roles / ClusterRoles, or use a detection script provided by the researcher.
As an example, here are three relatively common components that make great candidates for a juicy privilege escalation:
| Component | ClusterRole | ServiceAccounts |
|---|---|---|
| OpenTelemetry Collector | otel-otelcol-k8sobjects | opentelemetry-collector-daemonset-collector, opentelemetry-collector-deployment-collector |
| OpenTelemetry Operator | otel-operator-resources / opentelemetry-operator-manager | opentelemetry-operator |
| Rook-Ceph | rook-ceph-global, rook-ceph-mgr-cluster | rook-ceph-system, rook-ceph-mgr |
Note: there are many more. Graham Helton added an “Appendix: Affected Helm Charts” section at the end of his article, referencing AT LEAST 69 affected Helm charts according to him.
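If you would rather audit by hand than run the researcher's script, the logic can be sketched in a few lines of Python. This is a minimal sketch and not the detection script mentioned above: the function names are mine, the sample data is invented (including the rook-ceph namespace), and it only looks at ClusterRoleBindings because nodes are a cluster-scoped resource. It treats `*` and `*/proxy` as wildcards that also cover nodes/proxy.

```python
def rule_grants_nodes_proxy_get(rule: dict) -> bool:
    """True if one RBAC rule allows `get` on nodes/proxy in the core API group."""
    groups = rule.get("apiGroups") or []
    resources = rule.get("resources") or []
    verbs = rule.get("verbs") or []
    return (
        any(g in ("", "*") for g in groups)
        and any(r in ("nodes/proxy", "*/proxy", "*") for r in resources)
        and any(v in ("get", "*") for v in verbs)
    )


def service_accounts_with_nodes_proxy(cluster_roles: list, bindings: list) -> list:
    """List namespace/name of ServiceAccounts bound to a ClusterRole granting nodes/proxy GET."""
    risky_roles = {
        role["metadata"]["name"]
        for role in cluster_roles
        if any(rule_grants_nodes_proxy_get(rule) for rule in role.get("rules") or [])
    }
    found = set()
    for binding in bindings:
        ref = binding.get("roleRef") or {}
        if ref.get("kind") != "ClusterRole" or ref.get("name") not in risky_roles:
            continue
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.add(f'{subject.get("namespace", "")}/{subject["name"]}')
    return sorted(found)


# Invented sample data, shaped like the "items" of `kubectl get ... -o json`.
sample_roles = [
    {"metadata": {"name": "rook-ceph-global"},
     "rules": [{"apiGroups": [""], "resources": ["nodes", "nodes/proxy"],
                "verbs": ["get", "list", "watch"]}]},
    {"metadata": {"name": "view-nodes"},
     "rules": [{"apiGroups": [""], "resources": ["nodes"], "verbs": ["get"]}]},
]
sample_bindings = [
    {"roleRef": {"kind": "ClusterRole", "name": "rook-ceph-global"},
     "subjects": [{"kind": "ServiceAccount", "name": "rook-ceph-system",
                   "namespace": "rook-ceph"}]},
    {"roleRef": {"kind": "ClusterRole", "name": "view-nodes"},
     "subjects": [{"kind": "ServiceAccount", "name": "viewer", "namespace": "default"}]},
]
print(service_accounts_with_nodes_proxy(sample_roles, sample_bindings))
```

On a real cluster, the two lists would come from the `items` arrays of `kubectl get clusterroles -o json` and `kubectl get clusterrolebindings -o json`.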
The critical case: rook-ceph-system
In the official chart, the rook-ceph-system ServiceAccount combined two particularly dangerous permissions:
- nodes/proxy GET: the RCE
- secrets GET/LIST/WATCH across the entire cluster
Accessible secrets can include LUKS keys for volume encryption, Ceph admin keyrings, dashboard passwords… this kind of access makes it a prime target for an attacker.
The attack scenario: a compromise of the rook-ceph-operator Pod (via CVE, supply chain, or a malicious image) would allow reading all Ceph secrets, and then executing code in any Pod (including etcd), leading to a full compromise of the cluster and encrypted data.
To manually check if a ServiceAccount is vulnerable:
kubectl auth can-i get nodes --subresource=proxy \
--as=system:serviceaccount:<namespace>:<serviceaccount>
Examples of fixes to apply
Rook-Ceph: Upstream fix
For Rook-Ceph, the fix came from upstream: PR rook/rook#16979 removed nodes/proxy from ClusterRoles. This fix is included in Rook v1.19.1, so updating the affected clusters was enough.
After updating, checking across all clusters:
kubectl get clusterroles rook-ceph-global -o yaml | grep -A3 nodes/proxy
# -> nothing
OTel / OTel operator
For OpenTelemetry, the situation is potentially more complex. If you are using otel-operator and OtelCollector Custom Resources, you likely have to manage the RBAC manifests yourself.
Having had to do it myself, it’s quite painful. Depending on your collector type and the receivers you enabled, you need to cross-reference multiple documents on the official OTel and otel-operator websites.
Upstream merged a conditional approach in open-telemetry/opentelemetry-helm-charts#2083 based on the Kubernetes version:
# Upstream approach (opentelemetry-helm-charts#2083)
{{- if semverCompare ">=1.33-0" .Capabilities.KubeVersion.Version }}
- nodes/pods
{{- else }}
- nodes/proxy
{{- end }}
Here again, if all your clusters are up to date, you can simply replace nodes/proxy with nodes/pods directly (without the condition).
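If you generate your RBAC manifests outside of Helm, the same version switch is easy to reproduce. Below is a minimal Python sketch of that decision; the function name `kubelet_pods_resource` is hypothetical, and like the `>=1.33-0` constraint above it only looks at the major and minor version, so pre-releases and vendor suffixes of 1.33 count as 1.33.

```python
import re


def kubelet_pods_resource(kube_version: str, minimum: tuple = (1, 33)) -> str:
    """Pick the node subresource to request, based on the cluster version string."""
    match = re.match(r"v?(\d+)\.(\d+)", kube_version)
    if match is None:
        raise ValueError(f"unrecognized Kubernetes version: {kube_version!r}")
    major_minor = (int(match.group(1)), int(match.group(2)))
    # From 1.33 on, nodes/pods is available and nodes/proxy can be dropped.
    return "nodes/pods" if major_minor >= minimum else "nodes/proxy"


for version in ("v1.32.5", "v1.33.0", "v1.34.1-eks-abc"):
    print(version, "->", kubelet_pods_resource(version))
```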
otel-collector-crb.yaml:
# Before
rules:
  - apiGroups: [""]
    resources:
      - nodes
      - nodes/proxy # <- RCE risk
      - nodes/spec
      - nodes/stats
    verbs:
      - get
# After
rules:
  - apiGroups: [""]
    resources:
      - nodes
      # nodes/pods replaces nodes/proxy (RCE risk, see https://grahamhelton.com/blog/nodes-proxy-rce)
      # Requires K8s >= 1.33 (KEP-2862 fine-grained kubelet authz)
      - nodes/pods
      - nodes/spec
      - nodes/stats
    verbs:
      - get
otel-operator-rbac.yaml:
# Before
- apiGroups: [""]
  resources:
    - nodes/proxy # <- RCE risk
  verbs:
    - get
# After
# nodes/pods replaces nodes/proxy (RCE risk, see https://grahamhelton.com/blog/nodes-proxy-rce)
# Requires K8s >= 1.33 (KEP-2862 fine-grained kubelet authz)
- apiGroups: [""]
  resources:
    - nodes/pods
  verbs:
    - get
Preventive Measures
KEP-2862: Fine-Grained Kubelet API Authorization
As teased earlier, the real long-term solution is KEP-2862 (Fine-Grained Kubelet API Authorization). It adds granular subresources (nodes/pods, nodes/healthz, nodes/configz) next to the existing ones (nodes/metrics, nodes/stats, nodes/log, nodes/spec), allowing precise access without using nodes/proxy.
| K8s Version | KEP-2862 Status |
|---|---|
| 1.32 | Alpha |
| 1.33 | Beta, enabled by default - nodes/pods can replace nodes/proxy for reading the Pod list; nodes/proxy GET itself still reaches /exec, so it has to be removed from the roles |
| 1.36 | GA (locked to enabled) |
But this requires going through EVERY chart in use and checking all deployed manifests now and in the future.
CiliumNetworkPolicy: Blocking the Kubelet port
While waiting for the K8s upgrade, or as defense-in-depth, you can block access to port 10250 from the affected pods using NetworkPolicies (or CiliumNetworkPolicy if you use Cilium as your CNI plugin).
Warning: This only applies to components that do not need to access the Kubelet. The OTel collector potentially needs it to gather Kubelet metrics. In that case, you have no choice but to fix the RBAC.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: deny-kubelet-api-access
  namespace: <namespace>
spec:
  endpointSelector:
    matchLabels:
      <app-label>: <value>
  egressDeny:
    - toEntities:
        - host
        - remote-node
      toPorts:
        - ports:
            - port: "10250"
              protocol: TCP
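Once the policy is applied, it is worth checking that it really bites. Here is a minimal Python sketch of such a check, meant to be run from inside one of the selected Pods (assuming a Python interpreter is available there). The function name `kubelet_port_reachable` is mine, and the node IP you pass in is whatever your cluster uses; nothing here is specific to Cilium.

```python
import socket


def kubelet_port_reachable(host: str, port: int = 10250, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts (dropped packets).
        return False
```

Calling `kubelet_port_reachable("<node-ip>")` should return False from a Pod matched by the policy and True from a Pod that is not, which makes for a quick before/after comparison.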
Kyverno: Blocking the creation of new Roles with nodes/proxy
To prevent any regression (remember, we need to protect ourselves in the future), we can add a Kyverno ClusterPolicy that rejects the creation or modification of a ClusterRole or Role containing nodes/proxy.
Luckily, there are ready-to-use examples on the official Kyverno website:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-nodes-proxy
spec:
  validationFailureAction: Audit # switch to Enforce after validation
  background: true
  rules:
    - name: deny-nodes-proxy-in-clusterroles
      match:
        any:
          - resources:
              kinds:
                - ClusterRole
                - Role
      exclude:
        any:
          - resources:
              names:
                - "system:kubelet-api-admin" # built-in K8s, unmodifiable
      validate:
        message: >
          nodes/proxy grants RCE capability via Kubelet WebSocket exec.
          Use nodes/pods (requires K8s >= 1.33, KEP-2862) instead.
        deny:
          conditions:
            any:
              - key: "nodes/proxy"
                operator: AnyIn
                value: "{{ request.object.rules[].resources[] }}"
Deployment is done in two stages: first in Audit mode to ensure there are no remaining vulnerable manifests (which you should fix before blocking), then in Enforce mode to actually block them.
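To get a feel for what the deny condition evaluates, here is a rough Python equivalent that can be pointed at manifests in CI before the policy is enforced. It is a model of the rule above, not Kyverno itself: the function name `would_be_denied` is hypothetical, and it compares exact strings, so how the real policy treats wildcard entries such as `*` is something to verify in Audit mode.

```python
# Names excluded by the policy above.
EXCLUDED_NAMES = {"system:kubelet-api-admin"}


def would_be_denied(obj: dict) -> bool:
    """Mimic the policy: deny Roles/ClusterRoles listing nodes/proxy in any rule."""
    if obj.get("kind") not in ("ClusterRole", "Role"):
        return False
    if (obj.get("metadata") or {}).get("name") in EXCLUDED_NAMES:
        return False
    # Same flattening as the JMESPath projection request.object.rules[].resources[]
    resources = [
        resource
        for rule in obj.get("rules") or []
        for resource in rule.get("resources") or []
    ]
    return "nodes/proxy" in resources


# Invented example manifest, already parsed into a dict.
candidate = {
    "kind": "ClusterRole",
    "metadata": {"name": "otel-collector"},
    "rules": [
        {"apiGroups": [""], "resources": ["nodes", "nodes/proxy"], "verbs": ["get"]},
    ],
}
print(would_be_denied(candidate))
```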
Monitoring the Audit Log
Even if commands executed via the Kubelet leave no trace, we can monitor SubjectAccessReviews to detect attempts at enumerating nodes/proxy permissions.
The configuration in the Kubernetes audit policy:
# audit-policy.yaml
- level: Request
  verbs: ["create"]
  resources:
    - group: "authorization.k8s.io"
      resources: ["subjectaccessreviews"]
Then a Prometheus/Alertmanager alert on SARs related to nodes/proxy:
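# Detect SARs targeting nodes/proxy
# Note: the stock apiserver_audit_event_total counter carries no verb or resource
# labels, so this assumes a labeled metric derived from your audit log pipeline.
increase(
  apiserver_audit_event_total{
    verb="create",
    resource="subjectaccessreviews"
  }[5m]
) > 0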
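Since the interesting part (is the SAR about nodes/proxy?) lives in the request body, it can help to filter the audit log itself. Here is a minimal Python sketch that reads audit events as JSON lines and keeps only SubjectAccessReview creations whose resourceAttributes point at nodes/proxy. It relies on the Request level configured above so that requestObject is present; the function name and the sample events (user names included) are invented for the example.

```python
import json


def nodes_proxy_access_reviews(lines):
    """Yield one dict per audited SubjectAccessReview creation that targets nodes/proxy."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        object_ref = event.get("objectRef") or {}
        if event.get("verb") != "create":
            continue
        if object_ref.get("resource") != "subjectaccessreviews":
            continue
        spec = (event.get("requestObject") or {}).get("spec") or {}
        attrs = spec.get("resourceAttributes") or {}
        if attrs.get("resource") == "nodes" and attrs.get("subresource") == "proxy":
            yield {
                "requester": (event.get("user") or {}).get("username", ""),
                "subject": spec.get("user", ""),
                "checked_verb": attrs.get("verb", ""),
            }


# Invented sample events; real ones carry many more fields.
sample_log = [
    json.dumps({
        "verb": "create",
        "objectRef": {"resource": "subjectaccessreviews", "apiGroup": "authorization.k8s.io"},
        "user": {"username": "system:node:worker-1"},
        "requestObject": {"spec": {
            "user": "system:serviceaccount:monitoring:otel",
            "resourceAttributes": {"resource": "nodes", "subresource": "proxy", "verb": "get"},
        }},
    }),
    json.dumps({
        "verb": "create",
        "objectRef": {"resource": "subjectaccessreviews"},
        "user": {"username": "system:node:worker-1"},
        "requestObject": {"spec": {
            "user": "system:serviceaccount:default:app",
            "resourceAttributes": {"resource": "nodes", "subresource": "metrics", "verb": "get"},
        }},
    }),
    "not json",
]
matches = list(nodes_proxy_access_reviews(sample_log))
print(matches)
```

The `subject` field is the identity whose access was being checked, which is usually the one you want to alert on; depending on your audit policy the same request may be logged at several stages, so deduplicate if needed.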
References
- Kubernetes RCE via nodes/proxy GET - Graham Helton
- Detection Script
- KEP-2862 Fine-Grained Kubelet API Authorization
- Kubernetes RBAC Good Practices - nodes/proxy
- rook/rook#16979 - Fix upstream Rook-Ceph
- open-telemetry/opentelemetry-helm-charts#2083 - Fix upstream OTel
