v0.6 Release Blog
KubeArmor already has support for K8s orchestrated and Bare-Metal/VM workloads. With the v0.6 stable release, KubeArmor will also support un-orchestrated containerized workloads. KubeArmor supports both observability and policy enforcement in this mode.
KubeArmor recently completed a proof of concept (POC) with the LF Edge Open Horizon project. Open Horizon supports containerized workloads on the edge, so multiple applications from different vendors can be deployed on one edge node as separate containers. The security of such a multi-tenant solution must be taken into consideration: a security gap in one container should not lead to compromises in other containers or at the host level. Container isolation and hardening must ensure that the blast radius of any security flaw stays localized.
Discovering and enforcing least-permissive policies...
For enforcement, KubeArmor generates AppArmor profiles for individual containers based on the specified policy. The containers must start with the AppArmor profile attached (using security-opt apparmor='profile-name'). KubeArmor can later update these profiles dynamically to insert, modify, or remove AppArmor enforcement. Un-orchestrated workloads have a real use case on edge devices, where orchestration is hard due to resource restrictions. KubeArmor can now help protect such environments.
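As a rough illustration of the "start with the profile attached" requirement, the sketch below assembles a `docker run` command line that passes the `--security-opt apparmor=<profile>` option. It assumes Docker is the container runtime and that the named profile is already loaded on the host. The helper `docker_run_command` and the profile name `kubearmor-demo-profile` are hypothetical and used only for illustration; how KubeArmor actually names its profiles is not covered here.

```python
import shlex
from typing import List, Optional


def docker_run_command(image: str, apparmor_profile: str,
                       name: Optional[str] = None) -> List[str]:
    """Build a `docker run` argv that attaches an AppArmor profile.

    The profile must already be loaded on the host; this helper only
    assembles the command line and does not run anything.
    """
    args = ["docker", "run", "--detach",
            "--security-opt", f"apparmor={apparmor_profile}"]
    if name:
        args += ["--name", name]
    args.append(image)
    return args


if __name__ == "__main__":
    # Hypothetical profile and container names, for illustration only.
    cmd = docker_run_command("nginx:latest", "kubearmor-demo-profile", name="demo")
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps it safe to hand to `subprocess.run` without shell quoting concerns.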
Whitelisting is a security strategy where you predefine all the entities that are permitted to execute and access resources. It is a fairly extreme containment measure that, if properly implemented, can prevent many security issues. KubeArmor supports whitelisting through the "Allow" action in the policy specification. But, as mentioned, it is an extreme containment measure and is fairly hard to implement. What if we only want to restrict which processes can access a particular resource? With the v0.6 release, KubeArmor provides a way to do exactly that.
file:
  matchDirectories:
  - dir: /run/secrets/kubernetes.io/serviceaccount/
    recursive: true
    action: Block # Block access to service account token to everyone
  - dir: /
    recursive: true
  - dir: /run/secrets/kubernetes.io/serviceaccount/
    recursive: true
    fromSource:
    - path: /bin/cat # Allow access to service account token to only cat
process:
  matchDirectories:
  - dir: /
    recursive: true # Allow all other process execution in general
Service Account Tokens are automounted into Pods to give workloads access to the Kubernetes API server. But this token becomes problematic if an attacker gains access to a container via some other exploit. We can set automountServiceAccountToken: false, but for Pods that need the Service Account Token, the token is still exposed to every entity inside the Pod. The above KubeArmor policy restricts access to the Service Account Token to particular binaries (in this case /bin/cat).
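To make the intended outcome of this policy concrete, here is a toy model of the three file rules. It is a sketch of the behaviour described by the policy comments (block the token for everyone, allow it only for /bin/cat, allow everything else), not KubeArmor's actual rule resolution. The names `DirRule`, `covers`, and `is_allowed`, and the precedence order used, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

TOKEN_DIR = "/run/secrets/kubernetes.io/serviceaccount/"


@dataclass
class DirRule:
    directory: str                      # directory prefix, ends with "/"
    recursive: bool = True
    action: Optional[str] = None        # "Block", or None for "allowed"
    from_source: List[str] = field(default_factory=list)  # limit rule to these binaries


# Mirrors the three matchDirectories entries of the policy above.
RULES = [
    DirRule(TOKEN_DIR, action="Block"),
    DirRule("/"),
    DirRule(TOKEN_DIR, from_source=["/bin/cat"]),
]


def covers(rule: DirRule, path: str) -> bool:
    """True if `path` lies under the rule's directory."""
    if not path.startswith(rule.directory):
        return False
    remainder = path[len(rule.directory):]
    return rule.recursive or "/" not in remainder


def is_allowed(process: str, path: str, rules: List[DirRule] = RULES) -> bool:
    """Assumed precedence: source-specific allow, then block, then general allow."""
    matching = [r for r in rules if covers(r, path)]
    if any(r.action != "Block" and process in r.from_source for r in matching):
        return True
    if any(r.action == "Block" and (not r.from_source or process in r.from_source)
           for r in matching):
        return False
    return any(r.action != "Block" and not r.from_source for r in matching)
```

Under these assumptions, `is_allowed("/bin/cat", TOKEN_DIR + "token")` is true, the same call for `/bin/sh` is false, and both binaries can still read unrelated paths such as `/etc/hostname`.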
In v0.6, we profiled KubeArmor using pprof and made some major performance improvements, such as:
- When containerd is used as the runtime, KubeArmor uses the containerd client to monitor containers in the cluster. However, the container monitor was checking for new containers too frequently and calling a particularly time-consuming method each time. Reducing this frequency saved a lot of CPU cycles.
- With the migration to cilium/ebpf, we decreased KubeArmor's memory usage. Check out the section on the BCC deprecation in favor of cilium/ebpf for more details.
- In addition to the CPU and memory optimizations, we increased our perf buffer size to prevent event loss under high event rates.
We were able to reduce KubeArmor's resource consumption drastically 🎉
[Figures: KubeArmor resource consumption before and after the optimizations]
To discover all the improvements that we implemented, please check out the wiki article "Performance improvements in v0.6".
Starting with v0.6, KubeArmor can explicitly monitor system calls and raise alerts based on rules set by the user. The system call rule-matching engine offers multiple options for slicing and dicing system calls to obtain useful information about a system.
Users can set policies that alert on system calls based on criteria such as:
- system call name
- system call source (binary or directory)
- system call target (binary or directory)
In this example, we want to watch for directory deletions via the rmdir system call that affect any directory under /home/.
KubeArmorPolicy:
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: audit-home-rmdir
  namespace: default
spec:
  severity: 3
  selector:
    matchLabels:
      container: ubuntu-1
  syscalls:
    matchPaths:
    - syscall:
      - rmdir
      path: /home/
      recursive: true
  action:
    Audit
Generated telemetry
{
  "Timestamp": 1661936575,
  "UpdatedTime": "2022-08-31T09:02:55.841537Z",
  "ClusterName": "default",
  "HostName": "vagrant",
  "NamespaceName": "default",
  "PodName": "ubuntu-1-6779f689b5-jjcvh",
  "Labels": "container=ubuntu-1",
  "ContainerID": "1f613df8390b9d2e4e89d0323ac0b9a2e7d7ddcc460720e15074f8c497aec0df",
  "ContainerName": "nginx",
  "ContainerImage": "nginx:latest@sha256:b95a99feebf7797479e0c5eb5ec0bdfa5d9f504bc94da550c2f58e839ea6914f",
  "HostPPID": 255296,
  "HostPID": 302715,
  "PPID": 47,
  "PID": 67,
  "ParentProcessName": "/bin/bash",
  "ProcessName": "/bin/rmdir",
  "PolicyName": "audit-home-rmdir",
  "Severity": "3",
  "Type": "MatchedPolicy",
  "Source": "/bin/rmdir home/jane-doe/",
  "Operation": "Syscall",
  "Resource": "/home/jane-doe",
  "Data": "syscall=SYS_RMDIR",
  "Action": "Audit",
  "Result": "Passed"
}
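Telemetry in this shape is straightforward to post-process. The sketch below parses one alert and pulls out the fields most relevant to a syscall rule. It assumes alerts arrive as one JSON object per string, which is a guess about the delivery format rather than something stated above. The helper `summarize_syscall_alert` is hypothetical, and `SAMPLE_ALERT` is a trimmed copy of the telemetry shown above.

```python
import json
from typing import Optional

# Trimmed copy of the telemetry above, kept to the fields used below.
SAMPLE_ALERT = json.dumps({
    "PolicyName": "audit-home-rmdir",
    "Severity": "3",
    "Type": "MatchedPolicy",
    "Source": "/bin/rmdir home/jane-doe/",
    "Operation": "Syscall",
    "Resource": "/home/jane-doe",
    "Data": "syscall=SYS_RMDIR",
    "Action": "Audit",
    "Result": "Passed",
})


def summarize_syscall_alert(line: str) -> Optional[dict]:
    """Return a compact summary of a syscall alert, or None for other events."""
    event = json.loads(line)
    if event.get("Operation") != "Syscall":
        return None
    # "Data" carries key=value pairs; the sample contains only "syscall=...".
    data = dict(pair.split("=", 1)
                for pair in event.get("Data", "").split() if "=" in pair)
    return {
        "policy": event.get("PolicyName"),
        "syscall": data.get("syscall"),
        "resource": event.get("Resource"),
        "action": event.get("Action"),
        # Severity is a string in the sample, so convert it for comparisons.
        "severity": int(event.get("Severity", 0)),
    }


if __name__ == "__main__":
    print(summarize_syscall_alert(SAMPLE_ALERT))
```

Returning `None` for non-syscall events lets a caller filter a mixed stream with a simple truthiness check.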
For more information, please check out our policy specification documentation.
KubeArmorPolicy specification: https://docs.kubearmor.com/kubearmor/getting-started/security_policy_specification
KubeArmorHostPolicy specification: https://docs.kubearmor.com/kubearmor/getting-started/host_security_policy_specification
KubeArmor uses eBPF to trace various kernel events and gain visibility into what's happening. We have been using the BCC (BPF Compiler Collection) framework to interact with the kernel through eBPF. BCC goes to great lengths to simplify a BPF developer's life, but sometimes that extra convenience gets in the way and makes it harder to figure out what's wrong and how to fix it.
BCC bundles Clang and LLVM and compiles eBPF programs from the main program at runtime, causing sudden heavy resource utilisation that sometimes crashed our container due to resource limits. It also added a dependency on kernel headers, which had to be manually installed on the host and made available to KubeArmor for it to work.
In this release, we have migrated from BCC to libbpf and cilium/ebpf, and decoupled compilation of eBPF object files from the main KubeArmor container into an init container.
Some benefits that come with this migration include:
- No sudden outbursts of resource utilisation in the main container
- Lower Resource Utilisation
- No dependency on kernel headers if the kernel supports BTF (BPF Type Format). BTF is generally available on kernel versions > 5.2; for kernels without BTF information, we still require kernel headers.
- No CGO dependency in KubeArmor binary, making our systemd releases more portable and not dependent on certain glibc versions anymore.
- Lighter KubeArmor binary with a reduced attack surface, due to the removal of the LLVM toolchain and BCC from the KubeArmor container.
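To check whether a given host falls into the "BTF available" case from the list above, one common approach is to look for the kernel's exported type information at `/sys/kernel/btf/vmlinux`, which kernels built with `CONFIG_DEBUG_INFO_BTF` provide. The sketch below does only that. It is not necessarily how KubeArmor's init container makes the decision, and the function names are hypothetical.

```python
import os

# Kernels built with CONFIG_DEBUG_INFO_BTF expose their type info here.
BTF_VMLINUX = "/sys/kernel/btf/vmlinux"


def kernel_has_btf(path: str = BTF_VMLINUX) -> bool:
    """True if the running kernel exposes BTF type information at `path`."""
    return os.path.isfile(path)


def needs_kernel_headers(path: str = BTF_VMLINUX) -> bool:
    """Without BTF, kernel headers are still required to build the eBPF objects."""
    return not kernel_has_btf(path)


if __name__ == "__main__":
    if needs_kernel_headers():
        print("BTF not found: kernel headers are still required on this host")
    else:
        print("BTF found: kernel headers are not needed")
```

The `path` parameter exists only so the check can be exercised against a different location, for example in tests.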