Control-Plane Protection & Control-Plane Policing
It’s not very common to see people jump at the idea of configuring Control-Plane Policing/Protection. Part of me thinks people avoid this subject like the plague because they feel it causes more problems than it is worth. Well, let’s be honest: if you have had to troubleshoot a CoPP or CPPr policy, you know it is a fun process, especially since the control-plane is not usually the first thing anyone looks at and half the time issuing a ‘sh run’ is just not an option at first.
The first thing we should probably clear up is why we should protect the control-plane and what this is going to do for us. Let’s consider a few things the control-plane does:
- Handles packets that are not CEF switched, meaning the CPU has to take some time to handle these packets.
  - This is more important than you think: if the CPU is bombarded with a large number of packets, it must handle each one individually, and it can get so busy that it starts dropping other traffic.
- Maintains keep-alives for routing adjacencies.
- Handles traffic directed at the device itself.
  - SNMP/SSH and other management traffic.
The control plane does a bit more than that, but the three points above should get the point across.
The next thing I want to mention is how Control-Plane Protection (CPPr) differs from Control-Plane Policing (CoPP). Probably the main difference is that with CoPP you control and limit access to the entire control-plane. This sounds good and simple, but the control-plane is slightly more complex than that (go figure, right). CPPr, on the other hand, allows us to control access to the individual control-plane sub-interfaces, providing us with more direct control. Here is a diagram from Cisco.com that lays out the control-plane:
As you can see from the above diagram, applying a control-plane policy (CoPP) applies an aggregate policer to all traffic destined for the CPU. Branching out of that aggregate you can see three additional sub-interfaces of the control-plane:
- Host – The host sub-interface handles traffic destined for the router or one of its own interfaces, e.g. management-related traffic and some routing protocol traffic. (EIGRP, iBGP)
- Transit – This sub-interface handles software-switched IP traffic. (I think also ICMP unreachables/redirects, but I need to hammer away at the ‘transit sub-interface’ a bit more in the lab.)
- CEF-Exception – This sub-interface typically handles non-IP related packets such as ARP, LDP, and Layer 2 keepalives, along with some routing protocol traffic. (OSPF, eBGP)
Now that we have an understanding of what the control-plane does for us, and the differences between CoPP and CPPr, let’s jump into some configurations. Luckily this configuration follows the framework of a typical QoS policy, so if you are familiar with the structure of the MQC you should be able to follow right along.
First we are going to create a few ACLs to match our traffic:
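The original screenshot is not available here, so the following is only a minimal sketch of what such ACLs could look like. The ACL names are hypothetical, and the split of protocols simply follows the sub-interface descriptions above (management, EIGRP and iBGP towards the host sub-interface; OSPF and eBGP towards cef-exception).

```
! Hypothetical names; adjust sources and destinations to your own addressing.
! Management traffic aimed at the router itself
ip access-list extended ACL-CPPR-MGMT
 permit tcp any any eq 22
 permit udp any any eq snmp
!
! Routing traffic the post associates with the host sub-interface
ip access-list extended ACL-CPPR-HOST-ROUTING
 permit eigrp any any
 permit tcp any any eq bgp
 permit tcp any eq bgp any
!
! Routing traffic the post associates with the cef-exception sub-interface
ip access-list extended ACL-CPPR-CEF-ROUTING
 permit ospf any any
 permit tcp any any eq bgp
 permit tcp any eq bgp any
```

In a real deployment you would normally replace "any" with the actual peer and management station addresses, so the policy also narrows down who may talk to the control-plane.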
Let’s put those ACLs inside a few Class-Maps: (Only ACLs, match ip dscp/precedence, and match protocol arp are supported, which is why I did not use match protocol ospf/bgp. If you do, the command gets rejected when you try to apply the service-policy to the control-plane.)
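As a rough sketch, assuming ACLs named ACL-CPPR-MGMT, ACL-CPPR-HOST-ROUTING and ACL-CPPR-CEF-ROUTING exist (all of these names, and the class-map names, are made up for illustration), the Class-Maps could look like this:

```
! Each class-map only references an ACL, which is one of the match types
! that a control-plane service-policy accepts
class-map match-all CM-CPPR-MGMT
 match access-group name ACL-CPPR-MGMT
!
class-map match-all CM-CPPR-HOST-ROUTING
 match access-group name ACL-CPPR-HOST-ROUTING
!
class-map match-all CM-CPPR-CEF-ROUTING
 match access-group name ACL-CPPR-CEF-ROUTING
```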
Now we reference the Class-Maps within a Policy-Map and define our actions: (Note that with a CPPr Policy-Map you can only use the ‘police’ or ‘drop’ actions.)
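A minimal sketch of two such Policy-Maps, one per sub-interface, might look like the following. The policy-map and class names are hypothetical, and the police rates are placeholders rather than recommendations; you would want to baseline your own control-plane traffic before picking numbers.

```
! Policy intended for the host sub-interface
policy-map PM-CPPR-HOST
 class CM-CPPR-MGMT
  police 64000 conform-action transmit exceed-action drop
 class CM-CPPR-HOST-ROUTING
  police 128000 conform-action transmit exceed-action drop
!
! Policy intended for the cef-exception sub-interface
policy-map PM-CPPR-CEF
 class CM-CPPR-CEF-ROUTING
  police 128000 conform-action transmit exceed-action drop
```

I left class-default alone in this sketch on purpose: policing everything that is not explicitly matched is an easy way to starve traffic such as ARP on the cef-exception sub-interface.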
Finally we apply a service-policy referencing the Policy-Map we just created.
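Assuming two Policy-Maps named PM-CPPR-HOST and PM-CPPR-CEF (hypothetical names), attaching them to the host and cef-exception sub-interfaces could be sketched like this:

```
! Attach one policy per control-plane sub-interface
control-plane host
 service-policy input PM-CPPR-HOST
!
control-plane cef-exception
 service-policy input PM-CPPR-CEF
```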
That applies a CPPr policy to two different control-plane sub-interfaces. If you simply want to perform CoPP, you could do the following:
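For the aggregate case, a single Policy-Map can carry all of the classes. This is only a sketch: PM-COPP and the class names are hypothetical, and the rates are placeholders.

```
! One policy covering everything punted to the control-plane
policy-map PM-COPP
 class CM-CPPR-MGMT
  police 64000 conform-action transmit exceed-action drop
 class CM-CPPR-HOST-ROUTING
  police 128000 conform-action transmit exceed-action drop
 class CM-CPPR-CEF-ROUTING
  police 128000 conform-action transmit exceed-action drop
```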
Apply the Policy-Map to the aggregate control-plane:
Notice the console message stating that CoPP has been enabled on the aggregate path. The CoPP policy shown in the two pictures above accomplishes just about the same thing as our CPPr policy (with a few exceptions I want you to point out).
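Assuming an aggregate Policy-Map named PM-COPP (a hypothetical name), the attachment itself is short:

```
! No sub-interface keyword: the policy lands on the aggregate control-plane
control-plane
 service-policy input PM-COPP
```

Afterwards, ‘show policy-map control-plane’ should list the classes along with their conform and exceed counters, which is a handy first stop when troubleshooting.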
A few links to CoPP and CPPr from Cisco.com.
Contract: ControlPlane resource
Most Kubernetes clusters require a cloud-controller-manager or CSI drivers in order to work properly. Before introducing the ControlPlane extension resource, Gardener had several different Helm charts for the cloud-controller-manager deployments for the various providers. Now, Gardener commissions an external, provider-specific controller to take over this task.
Which control plane resources are required?
As mentioned in the controlplane customization webhooks document, Gardener shall not deploy any cloud-controller-manager or any other provider-specific component. Instead, it creates a ControlPlane CRD that should be picked up by provider extensions. Its purpose is to trigger the deployment of such provider-specific components in the shoot namespace in the seed cluster.
What needs to be implemented to support a new infrastructure provider?
As part of the shoot flow Gardener will create a special CRD in the seed cluster that needs to be reconciled by an extension controller, for example:
The .spec.secretRef contains a reference to the provider secret pointing to the account that shall be used for the shoot cluster. However, the most important sections are the .spec.providerConfig and the .spec.infrastructureProviderStatus. The first one contains an embedded declaration of the provider-specific configuration for the control plane (that cannot be known by Gardener itself). You are responsible for designing what this configuration looks like. Gardener does not evaluate it but just copies this part from what has been provided by the end-user in the Shoot resource. The second one contains the output of the Infrastructure resource (that might be relevant for the CCM config).
In order to support a new control plane provider you need to write a controller that watches all ControlPlanes with .spec.type=<my-provider-name>. You can take a look at the below referenced example implementation for the Alicloud provider.
The control plane controller often deploys resources (e.g. pods/deployments) into the Shoot namespace in the Seed as part of its ControlPlane reconciliation loop. Because the namespace contains network policies that by default deny all ingress and egress traffic, the pods may need labels that match the selectors of the network policies in order to allow the required network traffic. Otherwise, they won’t be allowed to talk to certain other components (e.g., the kube-apiserver of the shoot). Please see this document for more information.
Non-provider specific information required for infrastructure creation
Most providers might require further information that is not provider specific but already part of the shoot resource. One example for this is the GCP control plane controller, which needs the Kubernetes version of the shoot cluster (because it already uses the in-tree Kubernetes cloud-controller-manager). As Gardener cannot know which information is required by providers, it simply mirrors the Shoot, Seed, and CloudProfile resources into the seed. They are part of the Cluster extension resource and can be used to extract information that is not part of the Infrastructure resource itself.