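Visualizing GKE Container Networking with Cilium

I'm going to install Cilium, one of the available CNI plugins, on a k8s cluster built on GKE and try out some quick and easy container network visualization.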

Cilium

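It bills itself as "eBPF-based Networking, Observability, and Security". eBPF lets you attach sandboxed programs to hooks in the Linux kernel, such as system calls and network events, and run your own code there. Cilium is a solution that builds observability and security on top of this. If you want to dive into eBPF, the linked article covers it in great detail, and I recommend reading it.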

CNI Chaining

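On GKE, enabling network policy on a k8s cluster selects Calico as the CNI. There are two ways to use Cilium:

  1. Replace the CNI with Cilium
  2. Use CNI chaining and leave IP address management and the like to Calico

Cilium provides an option that runs a step to reconfigure the GKE cluster nodes, but I didn't want to change the node configuration this time, so I went with CNI chaining.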

Install

I'll follow the linked guide for the installation.

Apply the following manifest to the cluster you prepared.

$ cat chaining.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cni-configuration
  namespace: kube-system
data:
  cni-config: |-
    {
      "name": "generic-veth",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "calico",
          "log_level": "debug",
          "datastore_type": "kubernetes",
          "mtu": 1460,
          "ipam": {
              "type": "host-local",
              "subnet": "usePodCidr"
          },
          "policy": {
              "type": "k8s"
          },
          "kubernetes": {
              "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
          }
        },
        {
          "type": "portmap",
          "snat": true,
          "capabilities": {"portMappings": true}
        },
        {
          "type": "cilium-cni"
        }
      ]
    }

$ kubectl apply -f chaining.yaml
configmap/cni-configuration created

The MTU and ipam settings differ from the official example; they are matched to GKE's Calico configuration.
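As a quick sanity check on the chained configuration, the sketch below prints the plugin order from a CNI conflist read on stdin. The function name `cni_plugin_chain` is made up for this example, and it assumes `python3` is available on your workstation.

```shell
# Print the plugin chain of a CNI conflist (JSON on stdin), in list order.
# cni_plugin_chain is a hypothetical helper name; it assumes python3 is installed.
cni_plugin_chain() {
  python3 -c '
import json, sys
conf = json.load(sys.stdin)
print(" -> ".join(p["type"] for p in conf["plugins"]))
'
}

# Example against the ConfigMap created above (needs cluster access):
#   kubectl -n kube-system get configmap cni-configuration \
#     -o jsonpath='{.data.cni-config}' | cni_plugin_chain
```

For the manifest in this post, Calico should come first and `cilium-cni` last, since Cilium is chained onto the interface that Calico sets up.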

Install with Helm (version 3)!

$ helm install cilium cilium/cilium --version 1.8.2 \
  --namespace=kube-system -f deploy-cilium-with-chaining.yaml

NAME: cilium
LAST DEPLOYED: Mon Aug 24 13:58:16 2020
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
You have successfully installed Cilium.

Your release version is 1.8.2.

For any further help, visit https://docs.cilium.io/en/v1.8/gettinghelp
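The values file passed with `-f` is not shown in this post. As a rough sketch, a file along the following lines would match the generic veth chaining settings I believe the Cilium 1.8 guide uses; the keys are an assumption reconstructed from that guide, not the actual file used here, so check them against the chart version you install.

```shell
# Write a candidate values file for CNI chaining.
# Assumed reconstruction based on the Cilium 1.8 chaining guide, not the author's file.
cat > deploy-cilium-with-chaining.yaml <<'EOF'
global:
  cni:
    chainingMode: generic-veth    # chain Cilium onto an existing veth-based CNI
    customConf: true              # supply the CNI configuration yourself
    configMap: cni-configuration  # the ConfigMap applied earlier
  tunnel: disabled                # leave pod routing to the primary CNI
  masquerade: false               # leave masquerading to the primary CNI
EOF
```

Note that Cilium 1.9 and later dropped the `global.` prefix from these Helm values, so the layout above only applies to the 1.8 chart.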

Enable Hubble, the visualization interface

$ ./enable-hubble.sh
Release "cilium" has been upgraded. Happy Helming!
NAME: cilium
LAST DEPLOYED: Mon Aug 24 14:03:28 2020
NAMESPACE: kube-system
STATUS: deployed
REVISION: 2
TEST SUITE: None
NOTES:
You have successfully installed Cilium with Hubble Relay and Hubble UI.

Your release version is 1.8.2.

For any further help, visit https://docs.cilium.io/en/v1.8/gettinghelp
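The contents of `enable-hubble.sh` are not shown in this post either. Judging from the "upgraded" output, it wraps a `helm upgrade`; the sketch below writes a script with the flags I believe the Cilium 1.8 Hubble guide uses. Treat every flag as an assumption to verify against your chart version, not as the script that was actually run.

```shell
# Write a candidate enable-hubble.sh.
# Assumed reconstruction based on the Cilium 1.8 Hubble guide, not the author's script.
cat > enable-hubble.sh <<'EOF'
#!/bin/sh
helm upgrade cilium cilium/cilium --version 1.8.2 \
  --namespace kube-system \
  --reuse-values \
  --set global.hubble.enabled=true \
  --set global.hubble.listenAddress=":4244" \
  --set global.hubble.metrics.enabled="{dns,drop,tcp,flow,port-distribution,icmp,http}" \
  --set global.hubble.relay.enabled=true \
  --set global.hubble.ui.enabled=true
EOF
chmod +x enable-hubble.sh
```

`--reuse-values` keeps the chaining settings from the first install, so only the Hubble options change in revision 2.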

Port-forward the Hubble UI

$ kubectl -n kube-system port-forward svc/hubble-ui 12000:80
Forwarding from 127.0.0.1:12000 -> 12000
Forwarding from [::1]:12000 -> 12000
Handling connection for 12000

and take a look in the browser...

hubble installed

Huh, no flows are showing up.

So I went ahead and restarted Cilium.

$ kubectl -n kube-system rollout restart daemonset cilium
daemonset.apps/cilium restarted

Ooh!

Container network flow on hubble

But pods that aren't managed by Cilium yet still can't be visualized...

See the second Note in the guide:

The new CNI chaining configuration will not apply to any pod that is already running the cluster. Existing pods will be reachable and Cilium will load-balance to them but policy enforcement will not apply to them and load-balancing is not performed for traffic originating from existing pods.

You must restart these pods in order to invoke the chaining configuration on them.

I see. Time to restart them, then.

$ kubectl get pods -n kube-system -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 kubectl delete pod
pod "calico-node-vertical-autoscaler-bcc7978d-cxm2z" deleted
pod "calico-typha-horizontal-autoscaler-7cd7856b7b-jwwts" deleted
pod "calico-typha-vertical-autoscaler-7c4b89c9-9ftfh" deleted
pod "event-exporter-gke-59b99fdd9c-qh4qk" deleted
pod "fluentd-gke-scaler-cd4d654d7-jd8fl" deleted
pod "hubble-relay-f96dd575f-h6d8b" deleted
pod "hubble-ui-5ddff94674-nb9tf" deleted
pod "kube-dns-7c976ddbdb-f7lbg" deleted
pod "kube-dns-7c976ddbdb-pd5t9" deleted
pod "kube-dns-autoscaler-645f7d66cf-wq6rz" deleted
pod "l7-default-backend-678889f899-sww5r" deleted
pod "metrics-server-v0.3.6-7b7d6c7576-dcvpp" deleted
pod "stackdriver-metadata-agent-cluster-level-6868b6c756-vmdfr" deleted

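It took about five minutes. Of course, in an environment where other pods are already running, those pods need to be restarted as well, so apply this to a live service at your own risk.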
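Before deleting anything on a live cluster, it may help to preview which pods the filter selects. The sketch below is the same selection as the one-liner above, split out as a function and matching on the HOSTNETWORK column only. The name `select_chained_pods` is hypothetical.

```shell
# Read `kubectl get pods -o custom-columns=NAMESPACE...,NAME...,HOSTNETWORK...`
# output on stdin and print one "-n <namespace> <pod>" line for each pod that
# is NOT on the host network (third column is "<none>").
# select_chained_pods is a hypothetical helper name.
select_chained_pods() {
  awk '$3 == "<none>" { print "-n " $1 " " $2 }'
}

# Preview only (needs cluster access); nothing is deleted:
#   kubectl get pods -n kube-system --no-headers=true \
#     -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork \
#     | select_chained_pods
```

Once the list looks right, piping it into `xargs -L 1 kubectl delete pod` performs the restart as before.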

There it is!

All containers network flow on hubble

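It looks a bit like Kiali, though the feature set is simpler... That said, the metrics export is powerful! Hooked up to Prometheus, it looks like you could build some slick visualizations. (Not doing that this time.)

Trying out the hubble CLI

Port-forward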

$ kubectl -n kube-system port-forward svc/hubble-relay 4245:80
Forwarding from 127.0.0.1:4245 -> 4245
Forwarding from [::1]:4245 -> 4245

observe!

$ ./hubble observe -f --server localhost:4245
Aug 24 05:33:01.976 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK)
Aug 24 05:33:31.976 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:33:31.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:34:01.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:34:01.976 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK)
Aug 24 05:34:31.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:34:31.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:35:01.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:35:01.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:35:31.977 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:35:31.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:36:01.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:36:01.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:36:31.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:36:31.979 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:37:01.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:37:01.980 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:37:31.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:37:31.979 [gke-cilium-cilium-e61da247-snqw]: 35.243.71.11:443 -> kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:38:01.975 [gke-cilium-cilium-e61da247-snqw]: kube-system/calico-node-vertical-autoscaler-bcc7978d-5vspj:47694 -> 35.243.71.11:443 to-stack FORWARDED (TCP Flags: ACK, PSH)
Aug 24 05:38:13.273 [gke-cilium-cilium-0908de5c-fcpx]: kube-system/kube-dns-7c976ddbdb-cmpgb:10054 -> 10.61.0.73:52640 to-stack FORWARDED (TCP Flags: ACK, FIN)

Hmm.
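A raw stream like this is hard to read at a glance. As a small sketch, the function below tallies flows by observation point and verdict. It assumes the whitespace-separated text layout shown above and would need adjusting if a different Hubble version or output format is used; `summarize_flows` is a made-up name.

```shell
# Count flows per "<observation point> <verdict>" pair from `hubble observe`
# text output on stdin. Assumes the layout shown above, where field 8 is the
# observation point (to-endpoint, to-stack) and field 9 is the verdict.
# summarize_flows is a hypothetical helper name.
summarize_flows() {
  awk 'NF >= 9 { count[$8 " " $9]++ }
       END { for (k in count) print count[k], k }' | sort -rn
}

# Example (needs the port-forward to hubble-relay; run without -f so it exits):
#   ./hubble observe --server localhost:4245 | summarize_flows
```

A sudden appearance of a DROPPED line in this summary would be a quick hint that some policy is blocking traffic.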

Network Policy

So, we've seen that container network visualization is easy to set up. As for Network Policy, the upstream examples are extensive, so go ahead and try a few.
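As a starting point, here is a minimal sketch of a CiliumNetworkPolicy that only lets pods labeled `app: frontend` reach pods labeled `app: backend` on TCP port 80. The policy name, labels, port, and file name are all hypothetical; swap in the ones from your own workloads.

```shell
# Write a minimal example policy. Policy name, labels, and port are hypothetical.
cat > allow-frontend-to-backend.yaml <<'EOF'
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  endpointSelector:          # pods the policy applies to
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:           # allowed sources
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
EOF

# Apply it (needs cluster access):
#   kubectl apply -f allow-frontend-to-backend.yaml
```

After applying it, traffic to the backend pods from anything other than the frontend should show up in Hubble as dropped, which makes the effect of the policy easy to confirm.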

PS.

I found out only yesterday that this looks set to become native to GKE, lol

New GKE Dataplane V2 increases security and visibility for containers