How it started
One day I tried to use my Plex instance and it was unreachable, so I started digging into it.
A couple of containers wouldn’t start, and inspecting the cluster with the k9s CLI was taking ages.
Started digging
I started by looking into the logs and saw a couple of SQL-related issues, such as slow queries.
Then I remembered that K3s uses SQLite as its default database, which may not be ideal for long-lived clusters.
I checked its size and, oh my, it WAS HUGE!
root@raspberrypi:/var/lib/rancher/k3s/server/db# du -hs state.db
25G state.db
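A check like this can be scripted so the growth gets noticed before the cluster becomes unusable. The following is only a minimal sketch: the function name check_db_size and the size limit are hypothetical, and it assumes a du that supports the -m flag.

```shell
#!/bin/sh
# Hypothetical helper: warn when a file grows beyond a size limit (in MiB).
check_db_size() {
    db_path="$1"
    limit_mib="$2"
    if [ ! -f "$db_path" ]; then
        echo "missing: $db_path"
        return 2
    fi
    # du -m prints disk usage in MiB, followed by a tab and the path
    size_mib=$(du -m "$db_path" | cut -f1)
    if [ "$size_mib" -gt "$limit_mib" ]; then
        echo "WARNING: $db_path is ${size_mib} MiB (limit ${limit_mib} MiB)"
        return 1
    fi
    echo "OK: $db_path is ${size_mib} MiB"
    return 0
}

# Usage (path from the post, limit of 1024 MiB chosen arbitrarily):
#   check_db_size /var/lib/rancher/k3s/server/db/state.db 1024
```

Run from cron or a systemd timer, the non-zero exit code could be used to trigger a notification.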
The solution
I googled around a bit and found out that you can pretty easily migrate from SQLite to embedded etcd.
The migration is done by adding the --cluster-init argument to k3s and then restarting it.
How I did it
systemctl stop k3s
vim /etc/systemd/system/k3s.service
I added the --cluster-init argument to the ExecStart line, which resulted in something like this:
ExecStart=/usr/local/bin/k3s \
server \
'--disable=servicelb' \
'--disable=traefik' \
'--cluster-init' \
'--etcd-expose-metrics' \
...
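Since the flag name is easy to get backwards, it may be worth confirming the edit before starting the service again. This is a rough sketch, not part of the original procedure; has_cluster_init is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given unit file mentions --cluster-init.
has_cluster_init() {
    unit_file="$1"
    # -F matches the text literally; "--" stops grep from treating the
    # pattern itself as one of grep's own options
    grep -qF -- '--cluster-init' "$unit_file"
}

# Usage (unit file path from the post):
#   has_cluster_init /etc/systemd/system/k3s.service && echo "flag present"
```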
Then I rebooted the server and started the service:
sudo reboot
sudo systemctl start k3s
The start took a few seconds. Compare that with before: while debugging, whenever I tried restarting the k3s instance, it was eating 4 CPU cores and didn’t boot to a usable state even after 10+ minutes. Once the migration completed, k3s started running and all the containers were up and running.
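To double-check which datastore is in use after the restart, something like the sketch below could help. It rests on an assumption that is not stated in this post: that embedded etcd keeps its data in an etcd subdirectory of the same db directory where state.db lives. The function name detect_datastore is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: guess which datastore a k3s server db directory holds.
# Assumption: embedded etcd stores its data in "<db_dir>/etcd", while the
# SQLite datastore is the single file "<db_dir>/state.db".
detect_datastore() {
    db_dir="$1"
    if [ -d "$db_dir/etcd" ]; then
        # etcd wins even if an old state.db is still lying around
        echo "etcd"
    elif [ -f "$db_dir/state.db" ]; then
        echo "sqlite"
    else
        echo "unknown"
    fi
}

# Usage (directory from the post):
#   detect_datastore /var/lib/rancher/k3s/server/db
```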
This is the end :)