Cluster Configuration

GKE

Private Clusters

If you are using a private GKE cluster, you must create a firewall rule that allows the GKE-operated API server to communicate with the Linkerd control plane. This lets features such as automatic proxy injection receive requests directly from the API server.

In this example, we use gcloud to create the firewall rule.

Setup:

CLUSTER_NAME=your-cluster-name
gcloud config set compute/zone your-zone-or-region

Get the cluster MASTER_IPV4_CIDR:

MASTER_IPV4_CIDR=$(gcloud container clusters describe $CLUSTER_NAME \
  | grep "masterIpv4CidrBlock: " \
  | awk '{print $2}')

Get the cluster NETWORK:

NETWORK=$(gcloud container clusters describe $CLUSTER_NAME \
  | grep "^network: " \
  | awk '{print $2}')

Get the cluster auto-generated NETWORK_TARGET_TAG:

NETWORK_TARGET_TAG=$(gcloud compute firewall-rules list \
  --filter network=$NETWORK --format json \
  | jq ".[] | select(.name | contains(\"$CLUSTER_NAME\"))" \
  | jq -r '.targetTags[0]' | head -1)

The format of the network tag should be something like gke-cluster-name-xxxx-node.

Verify the values:

echo $MASTER_IPV4_CIDR $NETWORK $NETWORK_TARGET_TAG

# example output
10.0.0.0/28 foo-network gke-foo-cluster-c1ecba83-node
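Before creating the rule, it can help to check the three values programmatically, since an empty or malformed value would produce a rule that matches nothing. The following is a minimal sketch, not part of the official procedure; the function name check_firewall_inputs is hypothetical, and the tag pattern assumes the gke-cluster-name-xxxx-node format described above.

```shell
# Hypothetical helper: returns 0 only when all three values look usable.
# Arguments: $1 = master CIDR, $2 = network name, $3 = network target tag.
check_firewall_inputs() {
  # Expect an IPv4 CIDR such as 10.0.0.0/28.
  if ! printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}$'; then
    echo "MASTER_IPV4_CIDR looks wrong: '$1'" >&2
    return 1
  fi

  # The network name must not be empty.
  if [ -z "$2" ]; then
    echo "NETWORK is empty" >&2
    return 1
  fi

  # jq -r prints the literal string "null" when a rule has no targetTags,
  # so match the expected gke-...-node shape instead of testing for empty.
  case "$3" in
    gke-*-node) ;;
    *)
      echo "NETWORK_TARGET_TAG looks wrong: '$3'" >&2
      return 1
      ;;
  esac
}
```

One way to use it is to run check_firewall_inputs "$MASTER_IPV4_CIDR" "$NETWORK" "$NETWORK_TARGET_TAG" and continue only if it succeeds.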

Create the firewall rule for proxy-injector, policy-validator and tap:

gcloud compute firewall-rules create gke-to-linkerd-control-plane \
  --network "$NETWORK" \
  --allow "tcp:8443,tcp:8089,tcp:9443" \
  --source-ranges "$MASTER_IPV4_CIDR" \
  --target-tags "$NETWORK_TARGET_TAG" \
  --priority 1000 \
  --description "Allow traffic on ports 8443, 8089, 9443 for linkerd control-plane components"

Finally, verify that the firewall rule was created:

gcloud compute firewall-rules describe gke-to-linkerd-control-plane

Lifecycle Hook Timeout

Linkerd uses a postStart lifecycle hook for all control plane components and, by default, for all injected workloads. The hook polls proxy readiness through linkerd-await and blocks the main container from starting until the proxy is ready to handle traffic. By default, the hook times out after 2 minutes.
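To illustrate the general pattern of polling for readiness with a deadline, here is a minimal sketch. It is not the linkerd-await implementation; the function name await_ready and the one-second polling interval are assumptions made for illustration only.

```shell
# Illustrative only: run a readiness check repeatedly until it succeeds
# or the deadline passes.
# Arguments: $1 = timeout in seconds, remaining arguments = check command.
await_ready() {
  timeout_secs="$1"
  shift
  deadline=$(( $(date +%s) + timeout_secs ))

  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out after ${timeout_secs}s" >&2
      return 1
    fi
    sleep 1
  done
}
```

With a 2 minute timeout this would be called as await_ready 120 followed by whatever readiness check applies. The caller stays blocked until the check passes or the timeout is reached, which is why a proxy that never becomes ready delays the main container for the full timeout.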

CNI plugins that are responsible for setting up and enforcing NetworkPolicy resources can interfere with the lifecycle hook’s execution. While a lifecycle hook is running, the container does not reach the Running state. Some CNI plugin implementations acquire the Pod’s IP address only after all containers have reached the Running state and the kubelet has updated the Pod’s status through the API server. Without access to the Pod’s IP, these CNI plugins cannot operate correctly. This in turn blocks the proxy from being set up, since it lacks the necessary network connectivity.

As a workaround, users can manually remove the postStart lifecycle hook from control plane components. For injected workloads, users can opt out of the lifecycle hook through the root-level await: false option, or override the behavior at the workload or namespace level with the annotation config.linkerd.io/proxy-await: disabled. Removing the hook allows containers to start asynchronously, which unblocks network connectivity once the CNI plugin receives the Pod’s IP.
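As a sketch of the namespace-level override, the annotation can be applied with kubectl annotate. The function name disable_proxy_await and the namespace my-namespace are hypothetical placeholders; only the annotation key and value come from the text above.

```shell
# Hypothetical namespace name; replace with your own.
NAMESPACE=my-namespace

# Hypothetical helper: set the proxy-await override on a namespace.
# --overwrite replaces any existing value of the annotation.
disable_proxy_await() {
  kubectl annotate namespace "$1" \
    config.linkerd.io/proxy-await=disabled --overwrite
}
```

Calling disable_proxy_await "$NAMESPACE" sets the annotation. Because the proxy is injected when a pod is created, existing pods would most likely need to be recreated to pick up the change, for example with kubectl rollout restart deploy -n "$NAMESPACE".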