Kubernetes Cilium Service Announcement: In-Depth L2 and BGP Comparison

Irki Muzak
7 min read · Oct 5, 2023


Imagine you’ve successfully set up your Kubernetes environment — everything’s running smoothly, your applications are deployed, and you’re ready to roll. But then, you encounter a common challenge that often follows this initial triumph: enabling external access to your Kubernetes services.

In this article, we’ll delve into the solutions to this predicament and explore the underlying technologies that empower us to make our Kubernetes services accessible to the world: Layer 2 announcement and BGP load balancing, each offering unique advantages in achieving this goal.

Our Current Kubernetes Environment

Kubernetes topology
  • Here, we have a fully operational Kubernetes setup with Cilium as the container network interface (CNI), running inside a Proxmox Virtual Environment (Proxmox VE). Refer to the official documentation for installation details.
  • A virtual router (which I will refer to as the router) acts as the central gateway for traffic into our Kubernetes cluster, so that all the necessary configuration resides within Proxmox VE. This is not mandatory; a physical router can be used instead.
  • Kubernetes services (SVC), each with its own external IP.
a simple nginx service with external IP

However, those IP addresses can only be accessed from the nodes. External traffic and even the virtual router remain unaware of how to reach the services.

External traffic can’t access the service

The Objectives

Our primary aim is to make services accessible from the router.

Objective

So, What’s the Solution?

It’s simple: we need to inform the router that services are reachable from the nodes.

Solution

In the illustration above, the crucial step is the first one. Once the nodes have advertised the services, the router knows to send any packet destined for a service to one of the nodes first.

It’s worth noting that the router can choose any node as the packet destination, a concept we’ll explore further below.

Currently, we have two well-established solutions to achieve this in Cilium: Layer 2 announcement and BGP load balancing.

Layer 2 announcement

Let’s begin by examining the foundation of this solution, which is the ARP protocol.

Analogous Scenario

Imagine two people, Alice and Bob. If Alice intends to send a gift to Bob, she must first ascertain Bob’s residence.

How can Alice find out Bob’s address? It’s simple: she asks!

  • Alice inquires about Bob’s address.
  • Bob responds with his address.
  • Alice records Bob’s address.
  • Alice can now send the gift!
ARP analogy

We can apply this same concept to computer communication.

ARP protocol

The ARP (Address Resolution Protocol) serves the purpose of obtaining the MAC address corresponding to an IP address.

ARP behind the scenes

The example above is nearly identical, with the only difference being the use of IP addresses as identifiers and MAC addresses as actual addresses.

Communication begins with an ARP request asking which MAC address owns a given IP address. The owner answers with an ARP reply containing its MAC address, and the requester stores this IP and MAC pair in its ARP table.
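To make the request, reply, and ARP table steps concrete, here is a minimal sketch of the exchange as a toy simulation. The Host class, the resolve function, and all addresses are made up for illustration; a real network stack does this in the kernel, not in Python.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A hypothetical host on a single Ethernet segment."""
    name: str
    ip: str
    mac: str
    arp_table: dict = field(default_factory=dict)

    def handle_arp_request(self, target_ip):
        # Only the owner of the requested IP answers with its MAC address.
        return self.mac if target_ip == self.ip else None


def resolve(sender, target_ip, segment):
    """Broadcast an ARP request and cache the first reply, if any."""
    if target_ip in sender.arp_table:
        return sender.arp_table[target_ip]
    for host in segment:
        if host is sender:
            continue
        mac = host.handle_arp_request(target_ip)
        if mac is not None:
            # Store the IP and MAC pair in the sender's ARP table.
            sender.arp_table[target_ip] = mac
            return mac
    return None  # nobody on the segment owns this IP


alice = Host("alice", "192.0.2.10", "aa:aa:aa:aa:aa:aa")
bob = Host("bob", "192.0.2.20", "bb:bb:bb:bb:bb:bb")
segment = [alice, bob]

bob_mac = resolve(alice, "192.0.2.20", segment)      # Bob replies
unknown_mac = resolve(alice, "192.0.2.99", segment)  # no owner, no reply
```

Note the second lookup: when no host owns the requested IP, nobody replies and the table stays empty for that address.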

What about our Kubernetes case?

ARP is unusable for the case of Kubernetes

In our topology, the router won’t initiate the communication with an ARP request, as shown in the illustration above. As a result, the exchange never starts, and the services remain inaccessible from the router and from external traffic.

The solution involves bypassing the request phase and proceeding directly to the reply.

So, Do Nodes Send ARP Replies Without Prior Router Requests?

Precisely. An ARP reply without any preceding ARP request from the router is termed a gratuitous ARP reply. We employ this to notify the router that the SVC is accessible from the nodes, even without the router’s inquiry.

How gratuitous ARP solves the problem

Reviewing our goal illustration, the outcome remains the same. We’ve achieved our objective of informing the router of the SVC’s location, enabling the router to access the SVC.
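For the curious, this is roughly what an unsolicited ARP reply looks like on the wire. The sketch below only builds the bytes of the frame with Python's struct module; it does not send anything, and it is not how Cilium implements the feature. The service IP and node MAC are placeholder values, and using the broadcast address as the target hardware address is one common choice that I am assuming here.

```python
import socket
import struct


def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_gratuitous_arp_reply(service_ip, node_mac):
    """Build an Ethernet frame carrying an unsolicited ARP reply.

    Sender IP and target IP are both the service IP, and the frame is
    broadcast so every host on the segment can update its ARP table.
    """
    broadcast = b"\xff" * 6
    sender_mac = mac_to_bytes(node_mac)
    ip = socket.inet_aton(service_ip)

    # Ethernet header: destination, source, EtherType 0x0806 (ARP).
    ethernet_header = struct.pack("!6s6sH", broadcast, sender_mac, 0x0806)
    arp_payload = struct.pack(
        "!HHBBH6s4s6s4s",
        1,           # hardware type: Ethernet
        0x0800,      # protocol type: IPv4
        6,           # hardware address length
        4,           # protocol address length
        2,           # operation: reply
        sender_mac,  # sender hardware address (the announcing node)
        ip,          # sender protocol address (the service IP)
        broadcast,   # target hardware address (assumed choice)
        ip,          # target protocol address (same service IP)
    )
    return ethernet_header + arp_payload


# Placeholder values: a service IP announced by a node's MAC address.
frame = build_gratuitous_arp_reply("192.0.2.100", "02:00:00:00:00:01")
```

The key detail is that the sender and target protocol addresses are identical: the node is not answering anyone, it is announcing "this IP lives at my MAC".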

Failover, Not Load Balancer

Due to how ARP works at Layer 2, all traffic to a service is handled by only one node while the others remain on standby.

ARP table example

The screenshot above displays an example ARP table on a Linux machine generated by the arp command. Each IP address corresponds to a single MAC address, indicating that for the same SVC IP address, the ARP table stores only one node's MAC address.

For this reason, the router reaches the SVC through only one node. The process of selecting which node answers for the service is called an election, and the winner obtains a lease for the SVC.

L2 leader selection
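The election and lease idea can be sketched with a toy lease store. This is a simplified model under my own assumptions (a fixed lease duration, explicit timestamps, and a made-up LeaseStore class), not the actual Kubernetes lease API, but it shows why one node holds a service at a time and how another takes over when the holder stops renewing.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    holder: str
    expires_at: float


class LeaseStore:
    """Toy stand-in for lease objects: one lease per service."""

    def __init__(self, duration):
        self.duration = duration
        self.leases = {}

    def try_acquire(self, service, node, now):
        """Grant or renew the lease if it is free, expired, or already ours."""
        lease = self.leases.get(service)
        if lease is None or lease.expires_at <= now or lease.holder == node:
            self.leases[service] = Lease(node, now + self.duration)
            return True
        return False

    def leader(self, service, now):
        lease = self.leases.get(service)
        if lease is not None and lease.expires_at > now:
            return lease.holder
        return None


store = LeaseStore(duration=15.0)
nodes = ["master-1", "master-2", "master-3"]

# t=0: every node races for the lease; the first request wins.
results_t0 = [store.try_acquire("nginx", n, now=0.0) for n in nodes]
leader_t0 = store.leader("nginx", now=0.0)

# t=10: master-1 renews, so the lease now expires at t=25.
store.try_acquire("nginx", "master-1", now=10.0)

# master-1 then fails and stops renewing. At t=30 the lease has expired,
# so the next node to ask takes over.
results_t30 = [store.try_acquire("nginx", n, now=30.0) for n in nodes[1:]]
leader_t30 = store.leader("nginx", now=30.0)
```

In this run master-1 wins first, and master-2 takes over only after master-1's lease expires, which is exactly the failover behavior discussed in this section.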

You can check which nodes are selected as leaders for each service by running

kubectl get leases --all-namespaces

Here I’m using k9s to display the leases.

Leases

In my example, Master-1 is consistently selected as the leader for all services, creating an imbalance despite having three master nodes.

This is why L2 announcement behaves more like failover than load balancing: one node may handle far more traffic than the others.

L2 summary

  • + Relatively easy to set up, requiring only advertising from Kubernetes without router configuration.
  • - Limited network expansion options, as network advertising relies on ARP, not a routing protocol (explained further in the BGP section).
  • - As explained above, it primarily functions as failover rather than load balancing.

BGP load balancing

BGP (Border Gateway Protocol) stands as a dynamic routing protocol extensively used in global networking. Chances are, you’re accessing this article via the internet, which most likely employs BGP somewhere in its network.

Dynamic routing

Simple dynamic routing analogy

Suppose Alice knows Bob and Bob knows Cindy.

Without dynamic routing, Alice and Cindy won’t know that they can communicate with each other by using Bob as the intermediary.

However, dynamic routing rectifies this issue by informing both Alice and Cindy that they can communicate through Bob.
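The Alice, Bob, and Cindy story can be sketched as a tiny route propagation loop. This is a deliberately naive model that I wrote for illustration (the learn_routes function is hypothetical, and there are no metrics, timers, or loop prevention), not an implementation of BGP or any real routing protocol.

```python
def learn_routes(links):
    """Propagate reachability between neighbours until nothing changes.

    links: dict mapping each party to the set of parties it knows directly.
    Returns a dict mapping each party to {destination: next_hop}.
    """
    # Start with directly connected neighbours only (static knowledge).
    tables = {node: {peer: peer for peer in peers} for node, peers in links.items()}

    changed = True
    while changed:
        changed = False
        for node, peers in links.items():
            for peer in peers:
                # The peer shares every destination it knows about.
                for destination in list(tables[peer]):
                    if destination != node and destination not in tables[node]:
                        tables[node][destination] = peer  # reachable via peer
                        changed = True
    return tables


links = {
    "alice": {"bob"},
    "bob": {"alice", "cindy"},
    "cindy": {"bob"},
}
tables = learn_routes(links)
```

After the loop settles, Alice's table contains an entry for Cindy with Bob as the next hop, and Cindy's table contains the mirror entry for Alice.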

How Does It Work?

In BGP routing, we refer to each node and router as a peer. A BGP session forms only when both sides register each other as neighbors. The routes exchanged over those sessions carry many attributes that help BGP determine which route will be considered the best, including weight, distance, router ID, and many more.

BGP peering
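To give a feel for how attributes decide the best route, here is a heavily simplified comparison. It only looks at four attributes in a fixed order (weight, local preference, AS path length, then router ID as the final tie-breaker), which is a small subset of what real BGP implementations evaluate. The Route class and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    next_hop: str
    weight: int       # higher is preferred (local, vendor-specific)
    local_pref: int   # higher is preferred
    as_path_len: int  # shorter is preferred
    router_id: str    # lowest wins as the final tie-breaker


def router_id_key(router_id):
    # Compare router IDs numerically, octet by octet.
    return tuple(int(part) for part in router_id.split("."))


def preference_key(route):
    # Negate the "higher is better" attributes so min() picks the best route.
    return (
        -route.weight,
        -route.local_pref,
        route.as_path_len,
        router_id_key(route.router_id),
    )


def best_route(routes):
    return min(routes, key=preference_key)


routes = [
    Route("10.0.0.12", weight=0, local_pref=100, as_path_len=1, router_id="10.0.0.12"),
    Route("10.0.0.11", weight=0, local_pref=100, as_path_len=1, router_id="10.0.0.11"),
    Route("10.0.0.13", weight=0, local_pref=100, as_path_len=2, router_id="10.0.0.1"),
]
best = best_route(routes)
```

Here the third route loses on AS path length despite having the lowest router ID, and the remaining two are identical until the router ID breaks the tie.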

You can verify BGP connectivity using the Cilium CLI command cilium bgp peers. A successful BGP peering connection will display an established session state.

Other states you may encounter include:

  • active: indicates BGP is actively looking for BGP peers. This usually occurs when the router has not been set up with the correct BGP settings and neighbors.
  • idle: means BGP has stopped looking for peers. The last time I ran into this, the cause was a wrong IP configuration on a node, which led to an IP address conflict.

Load Balancing, Not Just Failover

each route has identical attributes

Each connection between the router and every node in this Kubernetes environment shares the same configuration, including distance, autonomous system number, and other attributes. Furthermore, each of these nodes can direct traffic to the same SVC.

When several identical routes lead to the same destination, the router can apply a feature known as ECMP (Equal-Cost Multi-Path). ECMP spreads traffic to the SVC across all nodes that advertise an identical route over BGP, typically by hashing each flow to one of the next hops, and this is the core of how load balancing works with BGP-based advertisement.

how BGP load balancer works
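One common way routers spread traffic over equal-cost routes is to hash each flow and use the result to pick a next hop. The sketch below illustrates that idea only: real routers use much cheaper hash functions in hardware, and the pick_next_hop function, the node IPs, and the flow tuples are all assumptions of mine.

```python
import hashlib
from collections import Counter


def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple to one of the equal-cost next hops."""
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]


# Three nodes that all advertise the same service route (placeholder IPs).
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]

# 3000 client flows to the same service IP, differing only in source port.
flows = [("198.51.100.7", 40000 + i, "192.0.2.100", 80, "tcp") for i in range(3000)]
distribution = Counter(pick_next_hop(flow, nodes) for flow in flows)
```

Because the choice depends only on the flow tuple, packets of one connection always go through the same node, while different connections are spread across all three nodes.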

Scalable

With BGP advertisements, we can connect the dynamic routes from Kubernetes to external dynamic routing domains, not only other BGP networks but also other dynamic routing protocols such as OSPF.

BGP scalability

BGP summary

  • + Supports real load balancing.
  • + Offers scalability by connecting BGP to external traffic for network expansion.
  • - Can be relatively complex for small-scale projects and may require more extensive setup knowledge.

Conclusion

Throughout this article, we have discussed how Layer 2 announcement and BGP load balancing work and compared their functionality. However, we’ve only scratched the surface without diving into the nitty-gritty details and real-world examples.

What’s Next on the Horizon?

The journey doesn’t end here. Stay on the lookout for the forthcoming detailed configuration guides for both Layer 2 and BGP. I’m eager to provide you with the practical insights you need to navigate the realm of Kubernetes connectivity effectively.
