Hi,
On 05.05.21 11:07, Andres Rojas Guerrero wrote:
Sorry, I have not understood the problem well. The problem I see is
that once the OSD fails, the cluster recovers but the MDS remains faulty:
*snipsnap*
    pgs: 1.562% pgs not active
         16128 active+clean
         238   incomplete
         18    down
The PGs in the down and incomplete states will not allow any I/O, and
this leads to the slow ops and the unavailability of the services. 32
OSDs are currently down; if all replicas of a PG are located only on
these OSDs, there will be no automatic recovery for that PG.
You will have to bring the OSDs back online to allow recovery. Are those
located on a single node or are multiple hosts involved?
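
To answer the single-node question quickly, something like the following
might help. It is only a rough sketch: it assumes the JSON printed by
"ceph osd tree --format json" has a top-level "nodes" list whose entries
carry "id", "name", "type", "children" (for buckets) and "status" (for
OSDs). The function names down_osds_by_host and load_osd_tree are made up
for this example and are not part of any Ceph tooling.

```python
import json
import subprocess
from collections import defaultdict


def down_osds_by_host(tree):
    """Group OSDs reported as "down" by the host bucket that contains them.

    `tree` is the parsed JSON of `ceph osd tree --format json`
    (assumed layout: {"nodes": [{"id", "name", "type", ...}, ...]}).
    """
    nodes = {node["id"]: node for node in tree.get("nodes", [])}
    result = defaultdict(list)
    for node in nodes.values():
        if node.get("type") != "host":
            continue
        # A host bucket lists the ids of its OSDs in "children".
        for child_id in node.get("children", []):
            child = nodes.get(child_id)
            if child is None or child.get("type") != "osd":
                continue
            if child.get("status") == "down":
                result[node["name"]].append(child["name"])
    return dict(result)


def load_osd_tree():
    """Run the ceph CLI and return the parsed OSD tree (needs cluster access)."""
    completed = subprocess.run(
        ["ceph", "osd", "tree", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(completed.stdout)
```

Calling down_osds_by_host(load_osd_tree()) on a node with admin access
should return a dict mapping each affected host name to its down OSDs. If
only one host shows up, the problem is probably local to that node.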
Regards,
Burkhard