Services are operational again and performance has recovered.
We conclude that the problem originated from three factors. First, we recently deleted a large amount of abusive content, much of it in large files. Second, automated snapshot trimming started today and deleted more data than usual. Third, this in turn put a high I/O load on our last remaining HDD, which brought down a host and some Ceph services (MDS) and slowed performance to a near-halt for a while.