Friday, August 30, 2024

k8s kubelet failed to reserve sandbox name

Sometimes containerd's on-disk state gets corrupted. Why it happens is still unclear.

Workaround.

systemctl stop kubelet
systemctl stop containerd


Move the corrupted containerd directories aside

mv /var/lib/containerd/ /var/lib/containerd_
mv /run/containerd/ /run/containerd_


Edit /etc/kubernetes/kubelet.conf and replace localhost in the server: line with the IP of any k8s API master, so kubelet can work without the missing nginx-proxy

grep localhost /etc/kubernetes/kubelet.conf
server: https://localhost:6443
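
The edit itself can be done with sed. A minimal sketch, assuming the file has a single server: line like the one above. The function name point_kubelet_at and the address 10.0.0.11 are placeholders, not values from this cluster; substitute the IP of a real API master.

```shell
# Rewrite the "server:" line of a kubeconfig so it targets the given host.
# The port is left untouched; the original file is kept as <file>.bak.
point_kubelet_at() {
    conf="$1"   # path to kubelet.conf
    host="$2"   # API master IP or hostname (assumed value in the usage below)
    sed -i.bak -E "s#server: https://[^:/]+#server: https://${host}#" "$conf"
}

# Usage (10.0.0.11 is a placeholder address):
#   point_kubelet_at /etc/kubernetes/kubelet.conf 10.0.0.11
#   grep server: /etc/kubernetes/kubelet.conf
```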


Bring the node back into service

systemctl restart containerd
systemctl restart kubelet


Watch the pods start

watch crictl ps


Once nginx-proxy is running again, revert kubelet.conf to localhost and restart kubelet
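
A sketch of the revert that refuses to run while the local proxy is still down. It assumes the proxy container is named nginx-proxy (as in the note above) and that crictl is on PATH; both function names are made up for this example.

```shell
# Succeeds only if crictl lists a running container whose name matches nginx-proxy.
nginx_proxy_running() {
    [ -n "$(crictl ps --name nginx-proxy -q 2>/dev/null)" ]
}

# Point the "server:" line back at localhost (port unchanged), keeping a .bak copy.
# Returns 1 and leaves the file alone if nginx-proxy is not running.
revert_kubelet_to_localhost() {
    conf="$1"
    if ! nginx_proxy_running; then
        echo "nginx-proxy is not running yet; leaving $conf unchanged" >&2
        return 1
    fi
    sed -i.bak -E 's#server: https://[^:/]+#server: https://localhost#' "$conf"
}

# Usage:
#   revert_kubelet_to_localhost /etc/kubernetes/kubelet.conf && systemctl restart kubelet
```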

Ceph pgs not deep-scrubbed in time

1. Increase osd_deep_scrub_interval (the value is in seconds)

ceph config set osd osd_deep_scrub_interval 1814400
ceph tell 'osd.*' config set osd_deep_scrub_interval 1814400
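
The value 1814400 is 21 days expressed in seconds. A small helper (the name days_to_seconds is made up here) makes it easier to pick another interval without doing the arithmetic by hand:

```shell
# Convert a whole number of days to seconds.
days_to_seconds() {
    echo $(( $1 * 24 * 60 * 60 ))
}

# Usage:
#   days_to_seconds 21
#   ceph config set osd osd_deep_scrub_interval "$(days_to_seconds 21)"
#   ceph config get osd osd_deep_scrub_interval    # check the stored value
```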


2. Temporarily increase the number of active scrub tasks

ceph tell 'osd.*' injectargs --osd_max_scrubs=2


3. Check the load average (LA) and, if needed, raise osd_scrub_load_threshold

ceph tell 'osd.*' injectargs --osd_scrub_load_threshold=5
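
As far as the Ceph documentation describes it, the threshold is compared against the load average divided by the number of online CPUs, not against the raw LA. A rough way to see that number on an OSD host, assuming Linux with /proc/loadavg and nproc available (the function name normalized_load is made up):

```shell
# Print the 1-minute load average divided by the CPU count.
# $1: loadavg-style file (default /proc/loadavg), $2: CPU count (default: nproc).
normalized_load() {
    loadavg_file="${1:-/proc/loadavg}"
    cpus="${2:-$(nproc)}"
    awk -v cpus="$cpus" '{ printf "%.2f\n", $1 / cpus }' "$loadavg_file"
}

# Usage:
#   normalized_load
#   ceph tell osd.0 config get osd_scrub_load_threshold   # runtime value on one OSD
```

If the printed number stays above the threshold, scrubs are likely being skipped because of load, and raising the threshold is what unblocks them.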


4. Force deep scrubs of the overdue PGs

ceph health detail | grep "not deep-scrubbed since" | awk '{ print $2 }' | xargs -n1 ceph pg deep-scrub
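
The same pipeline can be split so the list of PGs can be reviewed before anything is queued. A sketch, assuming the health detail lines look like "pg 1.2f not deep-scrubbed since ..." (the function name late_deep_scrub_pgs is made up):

```shell
# Read `ceph health detail` output on stdin and print the IDs of PGs
# reported as "not deep-scrubbed since".
late_deep_scrub_pgs() {
    awk '/not deep-scrubbed since/ { print $2 }'
}

# Usage:
#   ceph health detail | late_deep_scrub_pgs                      # review first
#   ceph health detail | late_deep_scrub_pgs | xargs -r -n1 ceph pg deep-scrub
```

The -r flag keeps GNU xargs from running ceph pg deep-scrub with no argument when the list is empty.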