Deploy Traefik in GKE using a (L7) Google HTTP/S Load Balancer

Monday, February 15, 2021

Using Google Load Balancers in GKE

Google Cloud offers multiple types of load balancers, and we can use them in GKE.

Usually, the first thing we do in GKE when we need to expose a service is to set type: LoadBalancer on it. Under the hood, GCP automatically creates a Regional External TCP/UDP Network Load Balancer (L4) that gives us an IP address we can use to reach our service.
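
For reference, a minimal sketch of such a service could look like this (the name, labels, and port are hypothetical):

# Minimal Service of type LoadBalancer: GCP provisions an external
# L4 network load balancer and assigns it a public IP.
apiVersion: v1
kind: Service
metadata:
  name: my-tcp-service
spec:
  type: LoadBalancer
  selector:
    app: my-tcp-service
  ports:
    - name: ssh
      protocol: TCP
      port: 22
      targetPort: 22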

This LoadBalancer is flexible, as we can expose TCP services such as SSH (used with Git, for example) or JNLP ports for Jenkins agents. One downside, however, is that it is regional, which can cause latency when the service is accessed from a distant region.

To deal with global latency, I think a good solution is to use an L7 Google External HTTP(S) Load Balancer because, according to Google, it uses Google Front Ends (GFEs) that are distributed globally and operate together using Google's global network and control plane. Of course, this load balancer works only with HTTP/S, so we can't use it to reach an SSH server, but that covers most of the use cases for exposing web servers (WebSockets are also supported).

The Google Cloud documentation provides a benchmark from Germany to us-central1 with the following results:

Option                   Ping                                            TTFB
No load balancing        110 ms to the web server                        230 ms
Network Load Balancing   110 ms to the in-region network load balancer   230 ms
HTTP(S) Load Balancing   1 ms to the closest GFE                         123 ms

That’s definitely an improvement!

There are also interesting features available with the Google L7 load balancer, such as the integrations with Cloud CDN and the Identity-Aware Proxy. Cloud CDN caches HTTP content close to users, which is particularly useful when the web app we are hosting includes heavy JavaScript libraries to be downloaded by the browser. The Identity-Aware Proxy restricts HTTP/S access to a list of users/groups that have a Google account.
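
In GKE, these features can be enabled by attaching a BackendConfig resource to the service sitting behind the load balancer. Here is a minimal sketch (the resource and secret names are hypothetical):

# Minimal BackendConfig sketch enabling Cloud CDN; the IAP stanza is
# optional and requires a secret holding an OAuth client id/secret.
# Attach it to a service with the annotation:
#   cloud.google.com/backend-config: '{"default": "my-backendconfig"}'
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig
spec:
  cdn:
    enabled: true
  # iap:
  #   enabled: true
  #   oauthclientCredentials:
  #     secretName: my-oauth-secret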

To use an HTTP load balancer instead of a network load balancer, we need to expose the service through a GCE Ingress and set type: NodePort in the Service object (more details here).
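
As a minimal sketch (with hypothetical names), this combination looks like the following; we will build the real thing for Traefik below:

# A NodePort service exposed through a GCE Ingress: GCP creates a global
# external HTTP(S) load balancer for the Ingress.
apiVersion: v1
kind: Service
metadata:
  name: my-web-service
spec:
  type: NodePort
  selector:
    app: my-web-service
  ports:
    - name: web
      port: 80
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: my-web-service
  annotations:
    kubernetes.io/ingress.class: "gce"
spec:
  backend:
    serviceName: my-web-service
    servicePort: 80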

Choosing an ingress controller

The way we indirectly use those load balancers in Kubernetes can have an impact on costs.

For example, the L4 load balancer is the default kind of load balancer used by the Voyager ingress controller, which, by design, uses one load balancer per ingress (more details here). Say you have 10 services to expose: you will use 10 ingresses, and GCP will create 10 L4 load balancers.

Regarding GCP pricing, there is a flat rate for the first 5 forwarding rules (billed separately for regional and global forwarding rules), and additional rules are billed on top of that. That means we should optimize our setup to use a minimum number of load balancers, ideally at most 5.

We could stack everything into a single ingress to have only one load balancer in GCP and minimize costs, but, in my ideal world, I would like to manage decoupled deployments, so I don't want to edit a central ingress containing all the “path-service” mappings.

We can address those concerns using Traefik: we can expose Traefik behind an L7 load balancer and use its CRDs to specify the ingress routes. We'll see next what this looks like.

Deploying Traefik

The Traefik Helm chart can be used to deploy Traefik in a GKE cluster; however, some modifications need to be made to get a proper setup using a Google HTTP Load Balancer.

First, we need to modify the pod template file to override the liveness/readiness probe port, because Google automatically creates health checks based on those probe ports (hopefully this will be supported by default in the official Helm chart).

diff --git a/traefik/templates/_podtemplate.tpl b/traefik/templates/_podtemplate.tpl
index f401d26..3d5ddc5 100644
--- a/traefik/templates/_podtemplate.tpl
+++ b/traefik/templates/_podtemplate.tpl
@@ -38,7 +38,7 @@
         readinessProbe:
           httpGet:
             path: /ping
-            port: {{ .Values.ports.traefik.port }}
+            port: {{ default .Values.ports.traefik.port .Values.ports.traefik.healthchecksPort }}
           failureThreshold: 1
           initialDelaySeconds: 10
           periodSeconds: 10
@@ -47,7 +47,7 @@
         livenessProbe:
           httpGet:
             path: /ping
-            port: {{ .Values.ports.traefik.port }}
+            port: {{ default .Values.ports.traefik.port .Values.ports.traefik.healthchecksPort }}
           failureThreshold: 3
           initialDelaySeconds: 10
           periodSeconds: 10

Then we need to modify the default Helm values:

  • The .service.type value is set to NodePort.
  • We override the ping configuration and the healthchecksPort.
  • We can also disable the websecure entrypoint, since SSL termination can be done at the load balancer level.

This looks like the following:

diff --git a/values.yaml b/values.yaml
index 1e0e5a9..785a1d3 100644
--- a/values.yaml
+++ b/values.yaml
@@ -183,9 +183,10 @@ globalArguments:
 # Additional arguments to be passed at Traefik's binary
 # All available options available on https://docs.traefik.io/reference/static-configuration/cli/
 ## Use curly braces to pass values: `helm install --set="additionalArguments={--providers.kubernetesingress.ingressclass=traefik-internal,--log.level=DEBUG}"`
-additionalArguments: []
-#  - "--providers.kubernetesingress.ingressclass=traefik-internal"
-#  - "--log.level=DEBUG"
+additionalArguments:
+  - "--providers.kubernetesingress.ingressclass=traefik"
+  - "--ping"
+  - "--ping.entrypoint=web"

 # Environment variables to be passed to Traefik's binary
 env: []
@@ -214,6 +215,11 @@ ports:
   # liveness probes, but you can adjust its config to your liking
   traefik:
     port: 9000
+
+    # Override the liveness/readiness port. This is useful to integrate traefik
+    # with an external Load Balancer that performs healthchecks.
+    healthchecksPort: 8000
+
     # Use hostPort if set.
     # hostPort: 9000
     #
@@ -251,7 +257,7 @@ ports:
   websecure:
     port: 8443
     # hostPort: 8443
-    expose: true
+    expose: false
     exposedPort: 443
     # The port protocol (TCP/UDP)
     protocol: TCP
@@ -286,7 +292,7 @@ tlsOptions: {}
 # from.
 service:
   enabled: true
-  type: LoadBalancer
+  type: NodePort
   # Additional annotations (e.g. for cloud provider specific config)
   annotations: {}
   # Additional service labels (e.g. for filtering Service by custom labels)

Finally, we can add, for example, the following Helm template to create a GCE ingress for the HTTP Load Balancer.

{{- if and .Values.service.enabled (eq .Values.service.type "NodePort") }}
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: traefik
  annotations:
    kubernetes.io/ingress.class: "gce"
    external-dns.alpha.kubernetes.io/hostname: {{ .Values.hostname }}
    kubernetes.io/ingress.allow-http: "false"
spec:
  backend:
    serviceName: {{ template "traefik.fullname" . }}
    servicePort: {{ .Values.ports.web.exposedPort }}
  tls:
    - secretName: {{ .Values.certificateSecret }}
{{- end }}
  • .Values.hostname could be traefik.example.com
  • .Values.certificateSecret is the secret name containing the TLS certificate generated, for example, with cert-manager (see the sketch below).
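
For reference, a minimal sketch of a cert-manager Certificate producing that secret could look like this (the issuer name is hypothetical and assumes a ClusterIssuer already exists):

# Minimal cert-manager Certificate sketch producing the TLS secret
# referenced by .Values.certificateSecret.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: traefik
spec:
  secretName: traefik-tls        # matches .Values.certificateSecret
  dnsNames:
    - traefik.example.com        # matches .Values.hostname
  issuerRef:
    name: letsencrypt-prod       # hypothetical ClusterIssuer
    kind: ClusterIssuer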

Using Traefik in other deployments

What does this Traefik setup look like in practice?

First, let’s create a whoami deployment.

kind: Deployment
apiVersion: apps/v1
metadata:
  name: whoami
  labels:
    app: whoami
spec:
  replicas: 1
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - image: containous/whoami:v1.5.0
          name: whoami
          ports:
            - name: web
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami
spec:
  ports:
    - protocol: TCP
      name: web
      port: 80
  selector:
    app: whoami

Then, using external-dns, we can create a CNAME record that points to Traefik, and a Traefik IngressRoute to route traffic to the whoami service.

apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: whoami
spec:
  endpoints:
  - dnsName: whoami.example.com
    recordTTL: 300
    recordType: CNAME
    targets:
    - traefik.example.com
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: whoami
spec:
  routes:
  - kind: Rule
    match: Host(`whoami.example.com`)
    services:
    - name: whoami
      port: 80

So far so good; let's try it out with curl https://whoami.example.com

We should get a result similar to the following:

Hostname: whoami-0
IP: 127.0.0.1
IP: 10.40.1.16                # whoami pod IP
RemoteAddr: 10.40.1.13:47784  # traefik pod IP
GET / HTTP/1.1
Host: whoami.example.com
User-Agent: curl/7.68.0
Accept: */*
Accept-Encoding: gzip
Via: 1.1 google
X-Client-Data: CgSL6ZsV
X-Cloud-Trace-Context: 1c5d7...
X-Forwarded-For: 10.40.1.1
X-Forwarded-Host: whoami.example.com
X-Forwarded-Port: 80
X-Forwarded-Proto: http
X-Forwarded-Server: traefik-6d6fcbd876-58nvn
X-Real-Ip: 10.40.1.1

We now have a decoupled setup to expose services in Kubernetes deployments, using a single global Google Cloud HTTP Load Balancer in front of Traefik. Yay!