Why
Why would you want to do this?
It makes self-hosting Kubernetes dead simple. There is no need to have Kubernetes nodes listen on the standard HTTP/HTTPS ports for your Gateway API, or to use wonky iptables configs to shuffle the traffic around. Cilium intercepts the packets destined for the Virtual IPs (VIPs) at the kernel level and moves them directly into the Kubernetes network.
Run it on a single node, or add as many nodes as you like. Traffic is load-balanced across all nodes, as Kubernetes traffic should be. I’m using a DaemonSet for my Nginx Gateway API, so all traffic that hits a node goes directly through the Gateway API to the Kubernetes services. No need to worry about traffic hitting one node, going to another node to enter the Gateway API, then to yet another node for the service/pod.
System Requirements
Cilium can be heavy, but this is a basic config, so the requirements to get started are minimal. This is the resource usage on a control plane node:
NAMESPACE       NAME                                        CPU(cores)   MEMORY(bytes)
default         echo-68cc87c476-8pfn8                       0m           2Mi
kube-system     cilium-envoy-hfw2t                          2m           19Mi
kube-system     cilium-operator-5d4cfd5575-ch4rl            3m           47Mi
kube-system     cilium-zf8hr                                19m          128Mi
kube-system     coredns-db7c7cbf8-p8zj9                     1m           23Mi
kube-system     dns-autoscaler-86bc484f5d-6hwm8             1m           17Mi
kube-system     kube-apiserver-kube-node-01                 65m          598Mi
kube-system     kube-controller-manager-kube-node-01        11m          88Mi
kube-system     kube-scheduler-kube-node-01                 7m           35Mi
kube-system     metrics-server-55bf4495db-fkmm8             3m           20Mi
kube-system     nodelocaldns-5lxwb                          2m           18Mi
nginx-gateway   ngf-nginx-gateway-fabric-8598d4c8c8-md954   3m           33Mi
OPNsense Config
Install the os-frr plugin (The FRRouting Protocol Suite)
Routing > BGP > General
- Enable BGP
- Set a private BGP AS Number for the OPNsense side of the link to the cluster (I used 64512)
- Apply
Routing > BGP > Neighbors
- New Neighbor
- Enable Neighbor
- Set Peer-IP to IP address of Kubernetes Node
- Set Remote AS to the private AS used by the cluster for its VIP pool (I used 64513)
- Repeat for all nodes
- Apply
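As a quick sanity check on the two AS numbers above, here is a minimal Python sketch. It assumes the 16-bit private-use range from RFC 6996 (64512 to 65534); the function and variable names are my own and are not part of OPNsense or FRR.

```python
# Private-use range for 16-bit AS numbers (RFC 6996).
PRIVATE_ASN_MIN = 64512
PRIVATE_ASN_MAX = 65534


def is_private_asn(asn: int) -> bool:
    """Return True if asn falls inside the 16-bit private-use range."""
    return PRIVATE_ASN_MIN <= asn <= PRIVATE_ASN_MAX


def session_type(local_asn: int, peer_asn: int) -> str:
    """Different AS numbers on each side make the session eBGP, equal ones iBGP."""
    return "eBGP" if local_asn != peer_asn else "iBGP"


opnsense_asn = 64512  # BGP AS Number under Routing > BGP > General
cluster_asn = 64513   # Remote AS set on each neighbor

print(is_private_asn(opnsense_asn), is_private_asn(cluster_asn))
print(session_type(opnsense_asn, cluster_asn))
```

Both values sit in the private range, and because they differ, the router and the nodes peer over eBGP.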
Kubespray Setup w/ Cilium BGP Peering to OPNsense
Clone the repo:
git clone https://github.com/kubernetes-sigs/kubespray/
Set up your inventory, and configure Kubespray (k8s-cluster.yml):
kube_owner: root # Required for Cilium, as explained in Kubespray docs
kube_network_plugin: cilium
kube_proxy_remove: true
# Cilium takes over all kube-proxy duties using eBPF
cilium_kube_proxy_replacement: true
Kubespray Cilium config (k8s-net-cilium.yml):
cilium_enable_bgp_control_plane: true
# -- Configure Loadbalancer IP Pools
cilium_loadbalancer_ip_pools:
  - name: "ip-pool"
    cidrs:
      - "10.17.50.0/28"
# -- Configure BGP Instances
cilium_bgp_cluster_configs:
  - name: "opnsense-bgp"
    spec:
      bgpInstances:
        - name: "opnsense-instance"
          localASN: 64513
          peers:
            - name: "opnsense-router"
              peerASN: 64512
              peerAddress: '10.17.40.1'
              peerConfigRef:
                name: "opnsense-peer"
# -- Configure BGP Advertisements
cilium_bgp_advertisements:
  - name: opnsense-advertisements
    labels:
      advertise: "bgp-services"
    spec:
      advertisements:
        - advertisementType: "Service"
          service:
            addresses:
              - LoadBalancerIP
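To see how many VIPs the 10.17.50.0/28 pool above can hold, a short sketch using Python's standard ipaddress module is enough. It only does the CIDR arithmetic; whether Cilium actually hands out the first and last address of the block depends on the pool settings, which this example does not model.

```python
import ipaddress

# The LoadBalancer pool CIDR from the Kubespray Cilium config.
pool = ipaddress.ip_network("10.17.50.0/28")

# Iterating a network yields every address in the block,
# including the first and last one.
addresses = list(pool)

print(pool.num_addresses)           # 16
print(addresses[0], addresses[-1])  # 10.17.50.0 10.17.50.15
```

The VIP used in the curl test later, 10.17.50.0, is the first address of this block.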
Post Kubespray Cilium Config
Apply the following manifests with kubectl apply -f:
apiVersion: "cilium.io/v2"
kind: CiliumBGPClusterConfig
metadata:
  name: opnsense-bgp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/os: "linux"
  bgpInstances:
    - name: "opnsense-instance"
      localASN: 64513
      peers:
        - name: "opnsense-router"
          peerASN: 64512
          peerAddress: "10.17.40.1" # OPNsense router's LAN IP
          peerConfigRef:
            name: "opnsense-peer-config"
---
# Define the connection type
apiVersion: "cilium.io/v2"
kind: CiliumBGPPeerConfig
metadata:
  name: opnsense-peer-config
spec:
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp-services"
---
# Tell Cilium to announce LoadBalancer IPs
apiVersion: "cilium.io/v2"
kind: CiliumBGPAdvertisement
metadata:
  name: opnsense-advertisement
  labels:
    advertise: "bgp-services"
spec:
  advertisements:
    - advertisementType: Service
      service:
        addresses:
          - LoadBalancerIP
      # NotIn with a value no Service uses: selects every Service
      selector:
        matchExpressions:
          - key: somekey
            operator: NotIn
            values:
              - never-match
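The selector at the end of the advertisement can look odd, so here is a rough Python model of how Kubernetes-style label selectors evaluate. It is a simplified sketch of the selector semantics, not Cilium's code, and the helper names are my own. The key point is that NotIn also matches when the key is absent, so "somekey NotIn [never-match]" selects practically every Service.

```python
def matches_expression(labels: dict, expr: dict) -> bool:
    """Evaluate one matchExpressions entry against a set of labels."""
    key, op = expr["key"], expr["operator"]
    values = expr.get("values", [])
    if op == "In":
        return key in labels and labels[key] in values
    if op == "NotIn":
        # A missing key also satisfies NotIn.
        return key not in labels or labels[key] not in values
    if op == "Exists":
        return key in labels
    if op == "DoesNotExist":
        return key not in labels
    raise ValueError(f"unsupported operator: {op}")


def selects(labels: dict, selector: dict) -> bool:
    """All matchLabels and all matchExpressions must hold (logical AND)."""
    label_ok = all(labels.get(k) == v
                   for k, v in selector.get("matchLabels", {}).items())
    expr_ok = all(matches_expression(labels, e)
                  for e in selector.get("matchExpressions", []))
    return label_ok and expr_ok


# Selector from the CiliumBGPAdvertisement above.
service_selector = {"matchExpressions": [
    {"key": "somekey", "operator": "NotIn", "values": ["never-match"]},
]}
# Selector from the CiliumBGPPeerConfig above.
advertisement_selector = {"matchLabels": {"advertise": "bgp-services"}}

print(selects({"app": "echo"}, service_selector))               # True
print(selects({}, service_selector))                            # True
print(selects({"somekey": "never-match"}, service_selector))    # False
print(selects({"advertise": "bgp-services"}, advertisement_selector))  # True
```

The last line models the other link in the chain: the peer config picks up the advertisement because its matchLabels equal the labels on the CiliumBGPAdvertisement.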
Confirm BGP session established:
❯ kubectl exec -n kube-system ds/cilium -c cilium-agent -- cilium bgp peers
Local AS   Peer AS   Peer Address     Session       Uptime   Family         Received   Advertised
64513      64512     10.17.1.1:179    established   6m21s    ipv4/unicast   0          0
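If you want to check the session state from a script instead of reading the table, a small parser along these lines could work. It is a sketch that assumes the eight-column layout shown above with one row per peer; output with several address families per peer may be laid out differently, and the Peer type and function names are my own.

```python
from typing import List, NamedTuple


class Peer(NamedTuple):
    local_as: int
    peer_as: int
    peer_address: str
    session: str
    uptime: str
    family: str
    received: int
    advertised: int


def parse_peers(output: str) -> List[Peer]:
    """Parse the rows of `cilium bgp peers` output into Peer tuples."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have 8 fields and start with a numeric local AS;
        # this skips the header row and blank lines.
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        peers.append(Peer(int(fields[0]), int(fields[1]), fields[2], fields[3],
                          fields[4], fields[5], int(fields[6]), int(fields[7])))
    return peers


def all_established(peers: List[Peer]) -> bool:
    """True only if there is at least one peer and every session is up."""
    return bool(peers) and all(p.session == "established" for p in peers)


sample = """\
Local AS Peer AS Peer Address Session Uptime Family Received Advertised
64513 64512 10.17.1.1:179 established 6m21s ipv4/unicast 0 0
"""

peers = parse_peers(sample)
print(peers[0].peer_address, peers[0].session)
print(all_established(peers))
```

In practice you would feed it the captured output of the kubectl exec command above rather than the sample string.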
Nginx Gateway API
NGINX Gateway Fabric Helm values (nginx-ingress-values.yaml):
nginx:
  kind: daemonSet # Run on all nodes. Use a Deployment and taints with separate control plane nodes.
  service:
    create: true
    type: LoadBalancer
    externalTrafficPolicy: Local
    # Tells Cilium BGP to announce the VIP
    # Bypasses internal cluster routing to preserve the original client IP address
    # Ensures the fastest possible path from the NIC to the Web Server
helm install ngf oci://ghcr.io/nginxinc/charts/nginx-gateway-fabric \
  -n nginx-gateway --create-namespace \
  -f nginx-ingress-values.yaml
Application Pod (Deployment)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - name: echo
          image: jmalloc/echo-server:latest
          ports:
            - containerPort: 8080
Expose the Pod to the cluster
apiVersion: v1
kind: Service
metadata:
  name: echo
  namespace: default
spec:
  selector:
    app: echo
  ports:
    - port: 8080
      targetPort: 8080
Tell NGINX Gateway Fabric to open Port 80
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: main-gateway
  namespace: default
spec:
  gatewayClassName: nginx
  listeners:
    - name: http
      port: 80
      protocol: HTTP
      allowedRoutes:
        namespaces:
          from: All
Connect the Gateway to the Service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo-route
  namespace: default
spec:
  parentRefs:
    - name: main-gateway
  hostnames:
    - "echo.local" # Domain for testing
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: echo
          port: 8080
curl -H "Host: echo.local" http://10.17.50.0/echo
Request served by echo-68cc87c476-8pfn8
GET /echo HTTP/1.1
Host: echo.local
Accept: */*
Connection: close
User-Agent: curl/8.20.0
X-Forwarded-For: 10.17.110.2
X-Forwarded-Host: echo.local
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Real-Ip: 10.17.110.2
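The same test can be scripted. The sketch below builds the equivalent request with Python's standard urllib: the URL targets the VIP directly, and the Host header is set to echo.local so that it matches the hostname on the HTTPRoute. The build_request helper is my own name, and the actual network call is left commented out because it only works from a machine that can reach the VIP.

```python
import urllib.request


def build_request(vip: str, host: str, path: str = "/") -> urllib.request.Request:
    """Build a GET request to the VIP that carries an explicit Host header."""
    return urllib.request.Request(f"http://{vip}{path}", headers={"Host": host})


req = build_request("10.17.50.0", "echo.local", "/echo")
print(req.full_url, req.get_header("Host"))

# Uncomment on a machine that has a route to the VIP:
# with urllib.request.urlopen(req, timeout=5) as resp:
#     print(resp.status)
#     print(resp.read().decode())
```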