A TCP proxy between networks in GCP
Monday, February 8, 2021
I worked on an interesting challenge recently and I thought it would be nice to write a blog post about it!
A little bit of context first: there are two Google Cloud projects (which could even be in two different organizations), and each of them has its own network. A client in project A needs to access an API in project B, but this API is not publicly exposed for security reasons.
We could think about peering the two networks, but we would potentially run into overlapping IP ranges. It would also be possible to solve this by reworking the subnet configuration, but that adds a lot of complexity (IP addressing rules, coordination between teams for network management) and it doesn’t really scale: what if I have to peer with a third network in the future?
My solution is to use a proxy between the two networks so that they remain decoupled as much as possible.
Here we create a vpc-proxy network in our project A, with a subnet range chosen to suit project B (i.e. not conflicting with its addressing). As this vpc-proxy is in our project, we can attach a VM network interface to it. As a result, we can create a VM with two network interfaces, connected respectively to vpc-network-a and vpc-proxy.
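As a rough sketch, the network plumbing could be created with gcloud along these lines; the names, region, zone and IP range below are placeholders to adapt to your own setup, only the dual-NIC idea matters:
# Dedicated proxy VPC, with a subnet range chosen to not clash with project B
gcloud compute networks create vpc-proxy --subnet-mode=custom
gcloud compute networks subnets create proxy-subnet \
    --network=vpc-proxy --range=192.168.1.0/24 --region=europe-west1
# Proxy VM with two NICs: nic0 in vpc-network-a, nic1 in vpc-proxy
gcloud compute instances create proxy-vm --zone=europe-west1-b \
    --network-interface=network=vpc-network-a,subnet=subnet-a \
    --network-interface=network=vpc-proxy,subnet=proxy-subnet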
A specific use case could be that the client in A is a Jenkins instance trying to deploy an application in a Kubernetes cluster in B. From Jenkins, we can point to the VM proxy IP in A which will route the request to the Kubernetes API server in B.
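For instance, once the TCP mapping described later in this post is in place (port 8000 on the proxy IP 192.168.0.2), a Jenkins job could reach the cluster with something like the following; the token is of course specific to your cluster, and TLS verification is relaxed here only because the API server certificate will not match the proxy IP:
# From the client in A, through the proxy, to the API server in B
kubectl --server=https://192.168.0.2:8000 \
        --token="$K8S_TOKEN" \
        --insecure-skip-tls-verify \
        get nodes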
To configure the proxy, I did the following:
L3 and L4 refer to the network and transport layers in the OSI model. We can solve network communications at L3 with proper routing tables that will make each network interface independent from the other. Then we can create TCP forwarding rules at L4 with haproxy, which will allow us to create TCP mappings between [proxy interface IP]:[source port] and [API destination IP]:[API destination port].
Deep dive into the actual configuration! First, we need two routing tables, one for each network interface. Fortunately, the Google Cloud documentation helps us with this.
Here is the setup in the proxy VM in my case:
# First routing table: for ens4 (nic0, 192.168.0.2, facing vpc-network-a)
echo "1 rt1" | sudo tee -a /etc/iproute2/rt_tables
sudo ip route add 192.168.0.1 src 192.168.0.2 dev ens4 table rt1
sudo ip route add default via 192.168.0.1 dev ens4 table rt1
sudo ip rule add from 192.168.0.2/32 table rt1
sudo ip rule add to 192.168.0.2/32 table rt1
# Second routing table: for ens5 (nic1, 192.168.1.2, facing vpc-proxy)
echo "2 rt2" | sudo tee -a /etc/iproute2/rt_tables
sudo ip route add 192.168.1.1 src 192.168.1.2 dev ens5 table rt2
sudo ip route add default via 192.168.1.1 dev ens5 table rt2
sudo ip rule add from 192.168.1.2/32 table rt2
sudo ip rule add to 192.168.1.2/32 table rt2
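Before testing connectivity, we can quickly double-check that the policy rules and tables are in place:
# Show the policy rules and the content of each custom table
ip rule show
ip route show table rt1
ip route show table rt2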
If you just run a simple ping from the proxy, the default gateway will be used and you will only be able to reach one network from the proxy. In our case, nic0 (the first interface, which holds the default gateway) is in network A, so a ping to 10.3.0.3 will reach the client in A. For the routing tables we just created to kick in, we need to specify the source IP in our commands. We can run sudo tcpdump -i ens4 -qtln icmp in a target VM to verify that it receives our ping requests.
Let’s test that the routing works from the proxy VM:
# Reaching the client
ping -I 192.168.0.2 10.3.0.3
# Reaching the server
ping -I 192.168.1.2 10.3.0.4
# And, for example, if we have a web server running on the server
curl --interface 192.168.1.2 http://10.3.0.4
The trick is to use the source IP. Now we need to install haproxy (sudo apt install haproxy) and configure our mappings, which is easy to do!
Let’s edit /etc/haproxy/haproxy.cfg:
global

defaults
    timeout client 30s
    timeout server 30s
    timeout connect 30s

frontend network-a-frontend
    mode tcp
    # Listen on the proxy interface that lives in network A
    bind 192.168.0.2:8000
    default_backend network-b-server

backend network-b-server
    mode tcp
    # Use the interface facing vpc-proxy as source, so its routing table is picked
    source 192.168.1.2
    server upstream 10.3.0.4:443
Notice that in the backend configuration, we specify the source IP so that the right routing table is picked. We could also add other frontend/backend entries to route TCP traffic the other way around, from B to A; a sketch of such a reverse mapping follows.
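For example, a reverse mapping could look like this, assuming a hypothetical service on the client listening on port 8080 and an arbitrary bind port of 9000:
frontend network-b-frontend
    mode tcp
    bind 192.168.1.2:9000
    default_backend network-a-client

backend network-a-client
    mode tcp
    # This time the source is the interface facing network A
    source 192.168.0.2
    server upstream 10.3.0.3:8080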
Let’s deploy the configuration with sudo service haproxy restart and check that it works from the client with curl 192.168.0.2:8000 (or curl -k https://192.168.0.2:8000 if the destination is a TLS endpoint, as a Kubernetes API server on port 443 would be).
Hopefully it does 😄
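If it does not, a first thing to check is whether haproxy accepted the configuration file at all; the -c flag only parses and validates it:
# Validate the configuration file without starting the proxy
sudo haproxy -c -f /etc/haproxy/haproxy.cfg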
We now have two decoupled networks that can communicate over TCP; the only thing left to maintain is the haproxy mappings that decide what goes where.
That setup feels very manual… but this could be done with Ansible as a second step!