Last week, our team was working on a feature enhancement to Kube360. We work with clients in regulated industries, and one of the requirements was fully encrypted traffic throughout the cluster. While we've supported Istio's mutual TLS (mTLS) as an optional feature for end-user applications, not all of our built-in services were using mTLS strict mode. We were working on rolling out that support.
One of the cornerstones of Kube360 is our centralized authentication system, which is primarily supplied by a service (called k3dash) that receives incoming traffic, performs authentication against an external identity provider (such as Okta, Azure AD, or others), and then provides those credentials to the other services within the clusters, such as the Kubernetes Dashboard or Grafana. This service in particular was giving some trouble.
Before diving into the bugs and the debugging journey, however, let's review both Istio's mTLS support and relevant details of how k3dash operates.
Interested in solving these kinds of problems? We're looking for experienced DevOps engineers to join our global team. We're hiring globally, and particularly looking for another US lead engineer. If you're interested, send your CV to [email protected].
What is mTLS?
In a typical Kubernetes setup, encrypted traffic comes into the cluster and hits a load balancer. That load balancer terminates the TLS connection, resulting in the decrypted traffic. That decrypted traffic is then sent to the relevant service within the cluster. Since traffic within the cluster is typically considered safe, for many use cases this is an acceptable approach.
But for some use cases, such as handling Personally Identifiable Information (PII), extra safeguards may be desired or required. In those cases, we would like to ensure that all network traffic, even traffic inside the same cluster, is encrypted. That gives extra guarantees against both snooping (reading data in transit) and spoofing (faking the source of data) attacks. This can help mitigate the impact of other flaws in the system.
Implementing this complete data-in-transit encryption system manually requires a major overhaul to essentially every application in the cluster. You'll need to teach all of them to terminate their own TLS connections, issue certificates for all applications, and add a new Certificate Authority for all applications to respect.
Istio's mTLS handles this outside of the application. It installs a sidecar that communicates with your application over a localhost connection, avoiding exposed network traffic. It uses port forwarding rules (via iptables) to redirect the pod's incoming and outgoing traffic through the sidecar. And the Envoy proxy in the sidecar handles all the logic of obtaining TLS certificates, refreshing keys, terminating TLS, and so on.
The way Istio handles all of this is pretty incredible. When it works, it works great. And when it fails, it can be disastrously difficult to debug. Which is what happened here (though thankfully it took less than a day to get to a conclusion). In the realm of epic foreshadowment, let me point out three specific aspects of Istio's mTLS worth keeping in mind.
- In strict mode, which is what we're going for, the Envoy sidecar will reject any incoming plaintext communication.
- Something I hadn't recognized at first, but have now fully internalized: normally, if you make an HTTP connection to a host that doesn't exist, you'll get a failed connection error. You definitely won't get an HTTP response. With Istio, however, you'll always make a successful outgoing HTTP connection, since your connection is going to Envoy itself. If the Envoy proxy cannot make the onward connection, it returns an HTTP 503 response with an error message in the body, like most proxies. (See the sketch just after this list.)
- The Envoy proxy has special handling for some protocols. Most importantly, if you make a plaintext HTTP outgoing connection, the Envoy proxy has sophisticated abilities to parse the outgoing request, understand details about various headers, and do intelligent routing.
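That second point is subtle enough to be worth a tiny demonstration. Below is a minimal sketch using the http-client package; the hostname is made up and this is illustrative only, not anything from our codebase. Run outside a mesh, the request to a nonexistent host throws a connection exception (the first branch); run inside a pod with an Envoy sidecar, the connection to the sidecar succeeds and you get back a real HTTP response, typically a 503 (the second branch).

{-# LANGUAGE ScopedTypeVariables #-}
-- Distinguish a connection-level failure from an HTTP error response.
import Control.Exception (try)
import Network.HTTP.Client
  (HttpException, defaultManagerSettings, httpLbs, newManager, parseRequest,
   responseStatus)

main :: IO ()
main = do
  manager <- newManager defaultManagerSettings
  -- Hypothetical hostname, for illustration only.
  req <- parseRequest "http://no-such-service.example.internal/"
  result <- try (httpLbs req manager)
  case result of
    Left (e :: HttpException) -> putStrLn ("connection-level failure: " ++ show e)
    Right res                 -> putStrLn ("got an HTTP response: " ++ show (responseStatus res))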
OK, that's mTLS. Let's talk about the other player here: k3dash.
k3dash and reverse proxying
The primary method k3dash uses to provide authentication credentials to other services inside the cluster is HTTP reverse proxying. This is a common technique, and common libraries exist for doing it. In fact, I wrote one such library years ago. We've already mentioned a common use case of reverse proxying: load balancing. In a reverse proxy situation, incoming traffic is received by one server, which analyzes the incoming request, performs some transformations, and then chooses a destination service to forward the request to.
One of the most important aspects of reverse proxying is header management. There are a few different things you can do at the header level (see the sketch after this list), such as:
- Remove hop-by-hop headers, such as transfer-encoding, which apply to a single hop and not the end-to-end communication between client and server.
- Inject new headers. For example, in k3dash, we regularly inject headers recognized by the final services for authentication purposes.
- Leave headers completely untouched. This is often the case with headers like content-type, where we typically want the client and final server to exchange data without any interference.
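To make that concrete, here's a minimal sketch of that kind of header handling. This is illustrative only, not the actual k3dash code: the function name, the auth header, and the hop-by-hop list are simplified stand-ins.

{-# LANGUAGE OverloadedStrings #-}
-- Drop a few hop-by-hop headers and inject an Authorization header
-- before forwarding a request to the destination service.
import Data.ByteString (ByteString)
import qualified Data.CaseInsensitive as CI
import Network.HTTP.Types (RequestHeaders)

-- A simplified list of hop-by-hop headers to strip on each hop.
hopByHop :: [CI.CI ByteString]
hopByHop = ["transfer-encoding", "connection", "keep-alive", "te", "upgrade"]

prepareHeaders :: ByteString -> RequestHeaders -> RequestHeaders
prepareHeaders authToken incoming =
    ("Authorization", "Bearer " <> authToken)
  : filter (\(name, _) -> name `notElem` hopByHop) incoming

Headers that aren't mentioned at all, like content-type, simply pass through untouched.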
As one epic foreshadowment example, consider the Host header in a typical reverse proxy situation. I may have a single load balancer handling traffic for a dozen different domain names, including domain names A and B. And perhaps I have a single service behind the reverse proxy serving the traffic for both of those domain names. I need to make sure that my load balancer forwards on the Host header to the final service, so it can decide how to respond to the request.
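To see why that forwarding matters, here's a toy sketch of a backend deciding what to serve based on the Host header it receives. The domain names are invented; the point is only that the decision happens at the final service, which is why the proxy in front of it must pass the header along.

{-# LANGUAGE OverloadedStrings #-}
-- A single backend serving two domains chooses its response based on
-- the Host header. If the proxy dropped or rewrote that header, the
-- backend could no longer tell the two sites apart.
import Data.ByteString (ByteString)

responseFor :: Maybe ByteString -> ByteString
responseFor (Just "a.example.com") = "welcome to site A"
responseFor (Just "b.example.com") = "welcome to site B"
responseFor _                      = "unknown host"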
k3dash in fact uses the library linked above for its implementation, and follows fairly standard header forwarding rules, plus some specific modifications within the application.
I think that's enough backstory, and perhaps you're already beginning to piece together what went wrong based on my clues above. Anyway, let's dive in!
The problem
One of my coworkers, Sibi, got started on the Istio mTLS strict mode migration. He got strict mode turned on in a test cluster, and then began to figure out what was broken. I don't know all the preliminary changes he made. But when he reached out to me, he'd gotten us to a point where the Kubernetes load balancer was successfully receiving the incoming requests for k3dash and forwarding them along to k3dash. k3dash was able to log the user in and provide its own UI display. All good so far.
However, following through from the main UI to the Kubernetes Dashboard would fail, and we'd end up with this error message in the browser:
upstream connect error or disconnect/reset before headers. reset reason: connection failure
Sibi believed this to be a problem with the k3dash codebase itself and asked me to step in to help debug.
The wrong rabbit hole, and incredible laziness
This whole section is just a cathartic gripe session on how I foot-gunned myself. I'm entirely to blame for my own pain, as we're about to see.
It seemed pretty clear that the outgoing connection from the k3dash pod to the kubernetes-dashboard pod was failing. (And this turned out to be a safe guess.) The first thing I wanted to do was make a simpler repro, which in this case involved kubectl exec-ing into the k3dash container and curl-ing to the in-cluster service endpoint. Essentially:
$ curl -ivvv http://kube360-kubernetes-dashboard.kube360-system.svc.cluster.local/
* Trying 172.20.165.228...
* TCP_NODELAY set
* Connected to kube360-kubernetes-dashboard.kube360-system.svc.cluster.local (172.20.165.228) port 80 (#0)
> GET / HTTP/1.1
> Host: kube360-kubernetes-dashboard.kube360-system.svc.cluster.local
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 503 Service Unavailable
HTTP/1.1 503 Service Unavailable
< content-length: 84
content-length: 84
< content-type: text/plain
content-type: text/plain
< date: Wed, 14 Jul 2021 15:29:04 GMT
date: Wed, 14 Jul 2021 15:29:04 GMT
< server: envoy
server: envoy
<
* Connection #0 to host kube360-kubernetes-dashboard.kube360-system.svc.cluster.local left intact
upstream connect error or disconnect/reset before headers. reset reason: local reset
This reproed the problem right away. Great! I was now completely convinced that the problem was not k3dash specific, since neither curl nor k3dash could make the connection, and they both gave the same upstream connect error message. I could think of a few different reasons for this to happen, none of which were correct:
- The outgoing packets from the container were not being sent to the Envoy proxy. I strongly believed this one for a while. But if I'd thought a bit harder, I would have realized that this was completely impossible. That upstream connect error message was of course coming from the Envoy proxy itself! If we were having a normal connection failure, we would have received the error message at the TCP level, not as an HTTP 503 response. Next!
- The Envoy sidecar was receiving the packets, but the mesh was confused enough that it couldn't figure out how to connect to the destination Envoy sidecar. This turned out to be partially right, but not in the way I thought.
I futzed around with lots of different attempts here but was essentially stalled. Until Sibi noticed something fascinating. It turns out that the following, seemingly nonsensical command did work:
curl http://kube360-kubernetes-dashboard.kube360-system.svc.cluster.local:443/
For some reason, making an insecure HTTP request to 443, the standard HTTPS port, worked. This made no sense, of course. Why would using the wrong port fix everything? And this is where incredible laziness comes into play. You see, Kubernetes Dashboard's default configuration uses TLS, and requires all of that setup I mentioned above about passing around certificates and updating accepted Certificate Authorities. But you can turn off that requirement and make it listen for plaintext connections. Since (1) this was intracluster communication, and (2) we've always had strict mTLS on our roadmap, we decided to simply turn off TLS in the Kubernetes Dashboard. However, when doing so, I forgot to switch the port number from 443 to 80.
Not to worry though! I did remember to correctly configure k3dash to communicate with Kubernetes Dashboard, using insecure HTTP, over port 443. Since both parties agreed on the port, it didn't matter that it was the wrong port.
But this was all very frustrating. It meant that the "repro" wasn't a repro at all. curl-ing on the wrong port was giving the same error message, but for a different reason. In the meanwhile, we went ahead and changed Kubernetes Dashboard to listen on port 80 and k3dash to connect on port 80. We thought there might be a possibility that the Envoy proxy was giving some special treatment to the port number, which in retrospect doesn't really make much sense. In any event, we ended up in a situation where our "repro" wasn't a repro at all.
The bug is in k3dash
Now it was clear that Sibi was right. curl could connect, k3dash couldn't. The bug must be inside k3dash. But I couldn't figure out how. Being the author of essentially all the HTTP libraries involved in this toolchain, I began to worry that my HTTP client library itself might somehow be the source of the bug. I went down a rabbit hole there too, putting together some minimal sample programs outside k3dash. I kubectl cp-ed them over and then ran them... and everything worked fine. Phew, my libraries were working, but not k3dash.
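The standalone check was something along these lines. This is a reconstruction for illustration, not the exact program, but it shows the shape of it: one plain GET request with http-client against the in-cluster service name.

-- Make a single GET request and print the status line and body.
import Network.HTTP.Client
  (defaultManagerSettings, httpLbs, newManager, parseRequest,
   responseBody, responseStatus)

main :: IO ()
main = do
  manager <- newManager defaultManagerSettings
  req <- parseRequest "http://kube360-kubernetes-dashboard.kube360-system.svc.cluster.local/"
  res <- httpLbs req manager
  print (responseStatus res)
  print (responseBody res)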
Then I did the thing I should have done at the very beginning. I looked at the logs very, very carefully. Remember, k3dash is doing a reverse proxy. So, it receives an incoming request, modifies it, makes the new request, and then sends a modified response back. The logs included the modified outgoing HTTP request (some fields modified to remove private information):
2021-07-15 05:20:39.820662778 UTC ServiceRequest Request {
host = "kube360-kubernetes-dashboard.kube360-system.svc.cluster.local"
port = 80
secure = False
requestHeaders = [("X-Real-IP","127.0.0.1"),("host","test-kube360-hostname.hidden"),("upgrade-insecure-requests","1"),("user-agent","<REDACTED>"),("accept","text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9"),("sec-gpc","1"),("referer","http://test-kube360-hostname.hidden/dash"),("accept-language","en-US,en;q=0.9"),("cookie","<REDACTED>"),("x-forwarded-for","192.168.0.1"),("x-forwarded-proto","http"),("x-request-id","<REDACTED>"),("x-envoy-attempt-count","3"),("x-envoy-internal","true"),("x-forwarded-client-cert","<REDACTED>"),("Authorization","<REDACTED>")]
path = "/"
queryString = ""
method = "GET"
proxy = Nothing
rawBody = False
redirectCount = 0
responseTimeout = ResponseTimeoutNone
requestVersion = HTTP/1.1
}
I tried to leave in enough content here to give you the same overwhelmed sense that I had looking at it. Keep in mind the requestHeaders field is in practice about three times as long. Anyway, with the slimmed-down headers, and all my hints throughout, see if you can guess what the problem is.
Ready? It's the Host header! Let's take a quote from the Istio traffic routing documentation. Regarding HTTP traffic, it says:
Requests are routed based on the port and Host header, rather than port and IP. This means the destination IP address is effectively ignored. For example, curl 8.8.8.8 -H "Host: productpage.default.svc.cluster.local", would be routed to the productpage Service.
See the problem? k3dash is behaving like a standard reverse proxy, and including the Host header, which is almost always the right thing to do. But not here! In this case, that Host header we're forwarding is confusing Envoy. Envoy is trying to connect to something (test-kube360-hostname.hidden) that doesn't respond to its mTLS connections. That's why we get the upstream connect error. And that's why we got the same response as when we used the wrong port number, since Envoy is configured to only receive incoming traffic on a port that the service is actually listening to.
The fix
After all of that, the fix is rather anticlimactic:
-(\(h, _) -> not (Set.member h _serviceStripHeaders))
+-- Strip out host headers, since they confuse the Envoy proxy
+(\(h, _) -> not (Set.member h _serviceStripHeaders) && h /= "Host")
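One small note on that comparison: assuming h here is the usual http-types HeaderName, it's a case-insensitive ByteString, so matching against "Host" also catches the lowercase host that appears in the log above. A quick sketch of that behavior:

{-# LANGUAGE OverloadedStrings #-}
-- Header names are CI (case-insensitive) ByteStrings, so this prints
-- True: "host" and "Host" compare equal.
import Data.ByteString (ByteString)
import qualified Data.CaseInsensitive as CI

main :: IO ()
main = print (("host" :: CI.CI ByteString) == "Host")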
We already had logic in k3dash to strip away specific headers for each service. And it turns out this logic was primarily used to strip out the Host header for services that got confused when they saw it! Now we just needed to strip away the Host header for all the services instead. Fortunately none of our services perform any logic based on the Host header, so with that in place, we should be good. We deployed the new version of k3dash, and voilà! Everything worked.
The moral of the story
I walked away from this adventure with a much better understanding of how Istio interacts with applications, which is great. I got a great reminder to look more carefully at log messages before hardening my assumptions about the source of a bug. And I got a great kick in the pants for being lazy about port number fixes.
All in all, it was about six hours of debugging fun. And to quote a great Hebrew phrase on it, "היה טוב, וטוב שהיה" (it was good, and good that it was (in the past)).
As I mentioned above, we're actively looking for new DevOps candidates, especially US-based candidates. If you're interested in working with a global team of experienced DevOps, Rust, and Haskell engineers, consider sending us your CV.
And if you're looking for a solid Kubernetes platform, batteries included, so you can offload this kind of tedious debugging to some other unfortunate souls (read: us), check out Kube360.