Istio Ingress Gateway with F5 BIG-IP Load Balancer as TLS Bridge
Context
We have recently introduced the Istio service mesh in our Kubernetes clusters to take advantage of its traffic management, security, and observability features. One of our key requirements is to enforce encryption (TLS) for traffic in transit at every layer. Within the cluster, this is handled by the mTLS that Istio provides. For external traffic and traffic at the service mesh edge, we use the F5 BIG-IP appliance, which not only acts as the load balancer for external ingress into the cluster, but also provides the TLS bridge between external traffic and the Istio ingress gateway.
The F5 provides SSL bridging by terminating TLS for external inbound traffic and then re-encrypting the traffic before sending it to the Istio ingress gateway in the k8s cluster. This allows the F5 to inspect the decrypted inbound traffic (deep packet inspection) while keeping the traffic encrypted on the wire at every hop.
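For reference, the Istio side of this bridge is an ingress Gateway resource that terminates the re-encrypted traffic from the F5. A minimal sketch of such a resource is shown below; the resource name and namespace are illustrative (and the apiVersion may differ by Istio release), while the hostname and credentialName match the example discussed in the rest of this post:

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: service-a-gateway        # illustrative name
  namespace: istio-system        # illustrative namespace
spec:
  selector:
    istio: ingressgateway        # bind to the default istio-ingressgateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: selfesigned-credential   # TLS secret the gateway presents
    hosts:
    - "service-a.company.com"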
Problem
When we first deployed this architecture, we noticed that client requests were failing with SSL connection errors. Initially, we suspected one of the following:
- Incorrect certificate configuration on the F5.
- F5 virtual server pool not configured properly for the backend istio-ingressgateway pods.
- Misconfigured Istio Gateway resource, causing traffic to be routed improperly.
- Missing or incorrect certificate for the backend service.
However, none of these initial theories panned out. We then looked at the logs from the istio-ingressgateway pods and realized that the gateway could not associate the incoming connections from the F5 with any of the hostnames configured in the Istio Gateway resource, so the TLS handshake between the F5 and the gateway never completed. Here’s what was happening:
- The client sends an HTTPS request to service-a.company.com. The URL resolves to an F5 virtual server, the F5 serves the certificate for “service-a.company.com”, and the client completes its TLS handshake with the F5.
- The F5 initiates a TLS handshake with one of the istio-ingressgateway pods in the virtual server pool (10.230.197.21), intending to forward the HTTPS request with the Host header set to “service-a.company.com” once the handshake completes.
- The Host header is part of the encrypted HTTP payload and is not available during the TLS handshake, so the istio-ingressgateway cannot associate the connection with the “service-a.company.com” gateway configuration and never presents the selfesigned-credential certificate. The TLS handshake fails (a minimal reproduction of this step follows the list).
- The F5 responds to the client with an SSL error.
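You can reproduce this failure without the F5 by using any TLS client that connects to the pod IP without sending SNI. The sketch below uses Python’s standard ssl module and the pod address from the trace above; it is an illustration under the assumption that the gateway has no catch-all server configured for port 443, not a transcript of our troubleshooting:

# reproduce_no_sni.py -- hypothetical reproduction of the failing handshake
import socket
import ssl

pod_ip = "10.230.197.21"   # istio-ingressgateway pod address from the trace above

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False      # we are connecting by IP on purpose
ctx.verify_mode = ssl.CERT_NONE # illustration only

try:
    with socket.create_connection((pod_ip, 443), timeout=5) as sock:
        # No server_hostname argument => no SNI extension in the ClientHello,
        # which is exactly what the F5 was sending before the fix.
        with ctx.wrap_socket(sock) as tls:
            print("handshake ok:", tls.version())
except OSError as exc:
    # Without SNI the gateway cannot match any configured server, so the
    # handshake is rejected and the connection fails.
    print("handshake failed:", exc)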
Solution
The root cause was that the istio-ingressgateway could not associate the incoming connection with the hostname “service-a.company.com”: the F5 addresses the gateway by IP address, and the originally requested hostname travels only in the Host header, which is encrypted and not available until after the TLS handshake completes.
Fortunately, TLS has an extension called SNI (Server Name Indication), which lets the client announce, in the ClientHello at the very start of the handshake, which hostname it wants to reach. The indicated hostname is independent of the address the client actually connected to, and because it is sent before any HTTP data, it sidesteps the problem of the hostname being locked inside the encrypted Host header. The istio-ingressgateway is SNI-aware, meaning it can use the hostname in the SNI to drive hostname-based gateway routing.
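To make the mechanism concrete, here is a minimal sketch (again using Python’s ssl module, with the pod IP and hostname from the trace above as assumed values) of a client that dials the gateway by IP but announces “service-a.company.com” via SNI, which is exactly what we needed the F5 to do:

# connect_with_sni.py -- hypothetical illustration of SNI-based routing
import socket
import ssl

pod_ip = "10.230.197.21"                   # address that actually gets dialed
requested_host = "service-a.company.com"   # hostname the client originally asked for

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False       # we dial an IP, so skip hostname verification here
ctx.verify_mode = ssl.CERT_NONE  # illustration only; validate certificates in production

with socket.create_connection((pod_ip, 443), timeout=5) as sock:
    # server_hostname populates the SNI extension in the ClientHello,
    # independently of the address we connected to.
    with ctx.wrap_socket(sock, server_hostname=requested_host) as tls:
        # The gateway can now match the "service-a.company.com" server block
        # and present the certificate configured for that host.
        print("handshake ok:", tls.version(), tls.cipher())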
The next question was how to get the F5 to include the proper SNI value in its TLS handshake with the istio-ingressgateway. This is where the iRules feature of the F5 BIG-IP proved handy: iRules allow scripted actions on network traffic passing through the BIG-IP. We added a simple iRule to our virtual server that takes the Host header from the incoming client request and inserts it as the SNI value in the outgoing TLS handshake to the istio-ingressgateway pods. Here’s the actual iRule code (source: https://support.f5.com/csp/article/K41600007):
when HTTP_REQUEST {
    # Set the SNI value (e.g. HTTP::host)
    set sni_value [getfield [HTTP::host] ":" 1]
}
when SERVERSSL_CLIENTHELLO_SEND {
    # SNI extension record as defined in RFC 3546/3.1
    #
    # - TLS Extension Type       = int16( 0 = SNI )
    # - TLS Extension Length     = int16( $sni_length + 5 byte )
    # - SNI Record Length        = int16( $sni_length + 3 byte )
    # - SNI Record Type          = int8( 0 = HOST )
    # - SNI Record Value Length  = int16( $sni_length )
    # - SNI Record Value         = str( $sni_value )
    #
    # Calculate the length of the SNI value, compute the SNI record / TLS
    # extension fields, and add the result to the SERVERSSL_CLIENTHELLO.
    SSL::extensions insert [binary format SSScSa* 0 [expr { [set sni_length [string length $sni_value]] + 5 }] [expr { $sni_length + 3 }] 0 $sni_length $sni_value]
}
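One prerequisite worth noting: the HTTP_REQUEST event only fires when the virtual server has an HTTP profile attached (so HTTP::host is available), and SERVERSSL_CLIENTHELLO_SEND only fires when a server SSL profile is in place to re-encrypt traffic toward the pool. Both should already be present in an SSL-bridging configuration like the one described above.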
With the iRule in place, the F5’s outbound ClientHello now carries the original client-requested hostname in the SNI extension. This immediately corrected the issue: our Istio ingress gateway configurations now work as designed, allowing us to perform hostname-based service routing at the mesh edge while keeping the traffic encrypted by way of the F5 SSL bridge.
Conclusion
Placing a load balancer between the client and the istio-ingressgateway can make hostname-based gateway routing at the edge of the service mesh challenging, particularly when the load balancer performs SSL bridging between the two. The SNI extension of the TLS protocol can solve the problem. If your load-balancing software or appliance automatically adds SNI to outbound TLS traffic, you might never see the issue. However, if you are using a load-balancing technology like the F5 BIG-IP, you might need to explicitly add SNI to outbound traffic using a scripted solution such as iRules.