Container Checkpointing with CRIU in Kubernetes #2563

Open
vd2892001 opened this issue Jan 10, 2025 · 6 comments

Comments

@vd2892001

Container Checkpointing with CRIU in Kubernetes:
Kubernetes version: 1.28
Cloud being used: (put bare-metal if not on a public cloud)
Installation method: Container Checkpointing with CRIU in Kubernetes
Host OS: Ubuntu 22.04
CRI and version: CRI-O
CRIU: 4.0

I set the enable_criu_support field in the CRI-O configuration file to true, then ran:

curl -sk -X POST "https://localhost:10250/checkpoint/default/nginx/nginx" \
  --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
  --cacert /etc/kubernetes/pki/ca.crt \
  --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt

Error:
ls -l /var/lib/kubelet/checkpoints/
ls: cannot access '/var/lib/kubelet/checkpoints/': No such file or directory

Why doesn't the checkpoint archive appear in that directory? What configuration is needed to make it appear there? Please help.

@adrianreber
Member

With Kubernetes 1.28 the feature is still marked as Alpha and needs to be explicitly enabled. Starting with Kubernetes 1.30 it defaults to on. You need to activate it.

@vd2892001
Author

drop_infra_ctr = false
enable_criu_support = true
I have enabled it in the CRI-O configuration.

@adrianreber
Member

You need to also enable it in Kubernetes.

@vd2892001
Author

kube-apiserver.yaml: - --feature-gates=ContainerCheckpoint=true
kube-controller-manager.yaml: - --feature-gates=ContainerCheckpoint=true
kube-scheduler.yaml: - --feature-gates=ContainerCheckpoint=true
I activated it in Kubernetes, but I still get the same error.

@adrianreber
Member

You also need to activate it for the kubelet.
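
For illustration, a minimal sketch of what enabling the gate on the kubelet could look like. It assumes a kubeadm-style node where the kubelet reads its KubeletConfiguration from /var/lib/kubelet/config.yaml and runs as a systemd unit; neither is confirmed in this thread. The helper name has_checkpoint_gate is hypothetical, and its simple pattern only recognizes the block-style YAML shown in the comment.

```shell
# Assumption: kubeadm-style node; the kubelet config path may differ on your system.
KUBELET_CONFIG="${KUBELET_CONFIG:-/var/lib/kubelet/config.yaml}"

# The kubelet config file needs a fragment like this:
#
#   featureGates:
#     ContainerCheckpoint: true
#
# After editing the file, restart the kubelet, for example:
#   sudo systemctl restart kubelet

# Hypothetical helper: succeeds if the given file sets ContainerCheckpoint to true.
has_checkpoint_gate() {
    grep -Eq '^[[:space:]]+ContainerCheckpoint:[[:space:]]*true[[:space:]]*$' "$1"
}

if [ -r "$KUBELET_CONFIG" ] && has_checkpoint_gate "$KUBELET_CONFIG"; then
    echo "ContainerCheckpoint: true found in $KUBELET_CONFIG"
else
    echo "ContainerCheckpoint: true not found in $KUBELET_CONFIG (or file not readable)"
fi
```

The feature gates set on kube-apiserver, kube-controller-manager, and kube-scheduler do not affect the kubelet, which is the component serving the /checkpoint endpoint on port 10250.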

@vd2892001
Author

vd2892001 commented Jan 10, 2025

How can I activate it? Can you guide me?
I looked at the installation instructions and couldn't find how to do it. This is what they say:
Prerequisites
Kubernetes cluster: A v1.25+ Kubernetes cluster.
Container runtime: A container runtime that supports container checkpointing:
containerd: Supported from v2.0.
CRI-O: v1.25 has support for container checkpointing.
CRI-O configuration: To use checkpointing with CRI-O, the runtime needs to be started with the command-line option --enable-criu-support=true.
🛠️ Operations
Post Checkpoint the Specified Container
Tell the kubelet to checkpoint a specific container from the specified Pod. Consult the Kubelet authentication/authorization reference for more information about how access to the kubelet checkpoint interface is controlled.

The kubelet will request a checkpoint from the underlying CRI implementation. In the checkpoint request, the kubelet will specify the name of the checkpoint archive as checkpoint-<podFullName>-<containerName>-<timestamp>.tar and also request to store the checkpoint archive in the checkpoints directory below its root directory (as defined by --root-dir). This defaults to /var/lib/kubelet/checkpoints.

The checkpoint archive is in tar format and can be listed using an implementation of tar. The contents of the archive depend on the underlying CRI implementation (the container runtime on that node).
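
A minimal sketch of how the archive could be located and listed once a checkpoint succeeds. The helper name latest_checkpoint is hypothetical, and the directory is the default mentioned above; it will differ if the kubelet runs with another --root-dir. Reading that directory usually requires root.

```shell
# Default location from the text above; override if the kubelet uses another --root-dir.
CHECKPOINT_DIR="${CHECKPOINT_DIR:-/var/lib/kubelet/checkpoints}"

# Hypothetical helper: print the most recently modified checkpoint archive in a directory.
# Prints nothing if the directory is missing or contains no archive.
latest_checkpoint() {
    ls -t "$1"/checkpoint-*.tar 2>/dev/null | head -n 1
}

archive="$(latest_checkpoint "$CHECKPOINT_DIR")"
if [ -n "$archive" ]; then
    # List the archive contents without extracting them.
    tar -tf "$archive"
else
    echo "no checkpoint archive found in $CHECKPOINT_DIR"
fi
```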

HTTP Request
POST /checkpoint/{namespace}/{pod}/{container}
Parameters
namespace (in path): string, required
Namespace
pod (in path): string, required
Pod
container (in path): string, required
Container
timeout (in query): integer
Timeout in seconds to wait until the checkpoint creation is finished. If zero or no timeout is specified, the default CRI timeout value will be used. Checkpoint creation time depends directly on the used memory of the container. The more memory a container uses, the more time is required to create the corresponding checkpoint.
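
As a sketch of how the path and query parameters above fit together, the following builds the request URL. The helper name checkpoint_url and the KUBELET_HOST and KUBELET_PORT variables are hypothetical; localhost and port 10250 are taken from the curl command at the top of this issue.

```shell
# Hypothetical helper: build the kubelet checkpoint URL.
# Usage: checkpoint_url NAMESPACE POD CONTAINER [TIMEOUT_SECONDS]
checkpoint_url() {
    url="https://${KUBELET_HOST:-localhost}:${KUBELET_PORT:-10250}/checkpoint/$1/$2/$3"
    # The timeout query parameter is optional; omit it to use the default CRI timeout.
    if [ -n "${4:-}" ]; then
        url="$url?timeout=$4"
    fi
    printf '%s\n' "$url"
}

# Example: the nginx container of the nginx Pod in the default namespace, 60 second timeout.
checkpoint_url default nginx nginx 60

# The URL can then be passed to curl with the client certificate from this issue:
#   curl -sk -X POST "$(checkpoint_url default nginx nginx 60)" \
#     --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
#     --cacert /etc/kubernetes/pki/ca.crt \
#     --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt
```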
Response
200: OK
401: Unauthorized
404: Not Found (if the ContainerCheckpoint feature gate is disabled)
404: Not Found (if the specified namespace, pod, or container cannot be found)
500: Internal Server Error (if the CRI implementation encounters an error during checkpointing)
500: Internal Server Error (if the CRI implementation does not implement the checkpoint CRI API)
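
Since curl -s prints nothing about the HTTP status, it can help to capture the status code and map it to the causes listed above. This is only a sketch; the helper name explain_checkpoint_status is hypothetical and its messages paraphrase the list rather than quote kubelet output.

```shell
# Hypothetical helper: map an HTTP status from the checkpoint endpoint to the causes listed above.
explain_checkpoint_status() {
    case "$1" in
        200) echo "OK: checkpoint created" ;;
        401) echo "Unauthorized: check the client certificate and key" ;;
        404) echo "Not Found: ContainerCheckpoint feature gate disabled, or namespace/pod/container not found" ;;
        500) echo "Internal Server Error: the CRI implementation failed or does not implement the checkpoint API" ;;
        *)   echo "Unexpected status: $1" ;;
    esac
}

# Usage with curl, which can print only the status code via -w '%{http_code}':
#   status="$(curl -sk -o /dev/null -w '%{http_code}' -X POST \
#     "https://localhost:10250/checkpoint/default/nginx/nginx" \
#     --key /etc/kubernetes/pki/apiserver-kubelet-client.key \
#     --cacert /etc/kubernetes/pki/ca.crt \
#     --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt)"
#   explain_checkpoint_status "$status"
```

A 404 for a Pod and container that do exist points at the feature gate not being active on the kubelet.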
Checkpointing
Once containers and pods are running, it is possible to create a checkpoint. Checkpointing is currently only exposed on the kubelet level. Triggering this kubelet API will request the creation of a checkpoint from CRI-O. CRI-O requests a checkpoint from your low-level runtime (for example, runc). Seeing that request, runc invokes the CRIU tool to do the actual checkpointing.
I can't see from these instructions what configuration is needed.
