Thanks for the response...
I don't have the old OSDs (and no backups, because this cluster is not
that important; it is the development cluster), so I need to delete the
unknown PGs (how can I do that?). But I don't want to wipe the whole
Ceph cluster; if I can delete the unknown and incomplete PGs, some data
will be lost, but not all of it, I think.
I will do that: reduce the number of replica copies and stabilize.
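Deleting the stuck PGs means declaring their data lost, so it is only acceptable on a cluster you can afford to damage. A minimal sketch of how the unknown/incomplete PGs could be collected and recreated empty; the pgids below are made-up sample output standing in for the live `ceph pg dump pgs`, and the script only prints the destructive commands rather than running them:

```shell
# Made-up sample of `ceph pg dump pgs` output (pgid and state columns);
# on the real cluster, pipe the live dump in instead.
pg_dump_sample='4.1a unknown            []
4.1b active+clean       [3,5,7]
4.1c incomplete         [2]
4.1d active+undersized  [1,4]'

# Build (but do not run) the commands: force-create-pg recreates each
# stuck PG as empty, i.e. its data is permanently discarded.
cmds=$(echo "$pg_dump_sample" |
  awk '$2 ~ /unknown|incomplete/ {
    print "ceph osd force-create-pg " $1 " --yes-i-really-mean-it"
  }')
echo "$cmds"
```

Review the generated list before piping it to a shell; unfound objects in otherwise-active PGs are handled separately with `ceph pg <pgid> mark_unfound_lost delete`.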
On 2020-10-29 13:11, Frank Schilder wrote:
>> ... I will use only one site now, but first I need to stabilize the
>> cluster to remove the erasure coding and use replication ...
>
> If you change to one site only, there is no point in getting rid of
> the EC pool. Your main problem will be restoring the lost data. Do you
> have backup of everything? Do you still have the old OSDs? You never
> answered these questions.
>
> To give you an idea why this is important: with ceph, losing 1% of
> data on an rbd pool does *not* mean you lose 1% of the disks. It
> means that, on average, every disk loses 1% of its blocks. In other
> words, getting everything up again will be a lot of work either way.
>
> The best path to follow is what Eugen suggested: add mons to have at
> least 3 and dig out the old disks to be able to export and import PGs.
> Look at Eugen's last 2 e-mails; it's a starting point. You might be
> able to recover more by reducing temporarily min_size to 1 on the
> replicated pools and to 4 on the EC pool. If possible, make sure there
> is no client access during that time. The rest of the missing data
> needs to be scraped off the OSDs you deleted from the cluster.
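The min_size reduction and PG export/import described above could look like the sketch below. It only prints the steps (ceph-objectstore-tool must run with the OSD daemon stopped and with direct access to the old disk); the pgid, OSD ids and mount path are made-up placeholders:

```shell
# Sketch only: prints the plan rather than executing it.
# Assumptions: PG 4.1f is incomplete, the removed disk is mounted at
# /var/lib/ceph/osd/ceph-12, and osd.0 is a surviving OSD.
pgid=4.1f
old_osd=/var/lib/ceph/osd/ceph-12
steps=$(cat <<EOF
# temporarily relax min_size so undersized PGs can activate
ceph osd pool set data_storage min_size 4
# export the PG from the old disk (its OSD daemon must be stopped)
ceph-objectstore-tool --data-path $old_osd --pgid $pgid --op export --file /tmp/$pgid.export
# import into a stopped surviving OSD, then restart it
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op import --file /tmp/$pgid.export
systemctl start ceph-osd@0
EOF
)
echo "$steps"
```

Remember to raise min_size back to its original value once the cluster is healthy again.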
>
> If you have backup of everything, starting from scratch and populating
> the ceph cluster from backup might be the fastest option.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Eugen Block <eblock(a)nde.ag>
> Sent: 28 October 2020 07:23:09
> To: Ing. Luis Felipe Domínguez Vega
> Cc: Ceph Users
> Subject: [ceph-users] Re: Huge HDD ceph monitor usage [EXT]
>
> If you have that many spare hosts I would recommend deploying two more
> MONs on them, and probably also additional MGRs so they can fail over.
>
> What is the EC profile for the data_storage pool?
>
> Can you also share
>
> ceph pg dump pgs | grep -v "active+clean"
>
> to see which PGs are affected.
> The remaining issue with unfound objects and unknown PGs could be
> because you removed OSDs. That could mean data loss, but maybe there's
> a chance to recover anyway.
>
>
> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>
>> Well, recovery is not working yet... I started 6 more servers and
>> the cluster has still not recovered.
>> Ceph status does not show any recovery progress.
>>
>> ceph -s : https://pastebin.ubuntu.com/p/zRQPbvGzbw/
>> ceph osd tree : https://pastebin.ubuntu.com/p/sTDs8vd7Sk/
>> ceph osd df : https://pastebin.ubuntu.com/p/ysbh8r2VVz/
>> ceph osd pool ls detail : https://pastebin.ubuntu.com/p/GRdPjxhv3D/
>> crush rules (ceph osd crush rule dump) : https://pastebin.ubuntu.com/p/cjyjmbQ4Wq/
>>
>> On 2020-10-27 09:59, Eugen Block wrote:
>>> Your pool 'data_storage' has a size of 7 (or 7 chunks since it's
>>> erasure-coded) and the rule requires each chunk on a different host
>>> but you currently have only 5 hosts available, that's why the
>>> recovery
>>> is not progressing. It's waiting for two more hosts. Unfortunately,
>>> you can't change the EC profile or the rule of that pool. I'm not
>>> sure
>>> if it would work in the current cluster state, but if you can't add
>>> two more hosts (which would be your best option for recovery) it
>>> might
>>> be possible to create a new replicated pool (you seem to have enough
>>> free space) and copy the contents from that EC pool. But as I said,
>>> I'm not sure if that would work in a degraded state, I've never
>>> tried that.
>>>
>>> So your best bet is to get two more hosts somehow.
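Copying the EC pool into a replicated pool, if attempted, might be sketched as below. The target pool name is made up, and note that `rados cppool` is deprecated and skips snapshots; for RBD images, `rbd migration` (Nautilus or later) is usually the safer route. Printed plan only:

```shell
# Printed plan only; adjust pg counts and names for the real cluster.
new_pool=data_storage_rep   # hypothetical target pool name
plan=$(cat <<EOF
ceph osd pool create $new_pool 32 32 replicated
ceph osd pool application enable $new_pool rbd
rados cppool data_storage $new_pool
EOF
)
echo "$plan"
```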
>>>
>>>
>>>> pool 4 'data_storage' erasure profile desoft size 7 min_size 5
>>>> crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32
>>>> autoscale_mode off last_change 154384 lfor 0/121016/121014 flags
>>>> hashpspool,ec_overwrites,selfmanaged_snaps stripe_width 16384
>>>> application rbd
>>>
>>>
>>> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>>>
>>>> Needed data:
>>>>
>>>> ceph -s : https://pastebin.ubuntu.com/p/S9gKjyZtdK/
>>>> ceph osd tree : https://pastebin.ubuntu.com/p/SCZHkk6Mk4/
>>>> ceph osd df : (later; I have been waiting for 10 minutes with no output yet)
>>>> ceph osd pool ls detail : https://pastebin.ubuntu.com/p/GRdPjxhv3D/
>>>> crush rules (ceph osd crush rule dump) : https://pastebin.ubuntu.com/p/cjyjmbQ4Wq/
>>>>
>>>> On 2020-10-27 07:14, Eugen Block wrote:
>>>>>> I understand, but I deleted the OSDs from the CRUSH map, so ceph
>>>>>> won't wait for those OSDs; am I right?
>>>>>
>>>>> It depends on your actual crush tree and rules. Can you share
>>>>> (maybe
>>>>> you already did)
>>>>>
>>>>> ceph osd tree
>>>>> ceph osd df
>>>>> ceph osd pool ls detail
>>>>>
>>>>> and a dump of your crush rules?
>>>>>
>>>>> As I already said, if you have rules in place that distribute data
>>>>> across 2 DCs and one of them is down the PGs will never recover
>>>>> even
>>>>> if you delete the OSDs from the failed DC.
>>>>>
>>>>>
>>>>>
>>>>> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>>>>>
>>>>>> I understand, but I deleted the OSDs from the CRUSH map, so ceph
>>>>>> won't wait for those OSDs; am I right?
>>>>>>
>>>>>> On 2020-10-27 04:06, Eugen Block wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> just to clarify so I don't miss anything: you have two DCs and
>>>>>>> one of them is down. And two of the MONs were in that failed DC?
>>>>>>> Now you removed all OSDs and two MONs from the failed DC, hoping
>>>>>>> that your cluster will recover? If you have reasonable crush
>>>>>>> rules in place (e.g. to recover from a failed DC), your cluster
>>>>>>> will never recover in the current state unless you bring OSDs
>>>>>>> back up in the second DC. That's why you don't see progress in
>>>>>>> the recovery process: the PGs are waiting for their peers in the
>>>>>>> other DC so they can follow the crush rules.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Eugen
>>>>>>>
>>>>>>>
>>>>>>> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>>>>>>>
>>>>>>>> I had 3 mons, but I have 2 physical datacenters, and one of
>>>>>>>> them broke with no short-term fix, so I removed all OSDs and
>>>>>>>> ceph mons (2 of them) there, and now I only have the OSDs of 1
>>>>>>>> datacenter with the remaining monitor. I had stopped the ceph
>>>>>>>> manager, but I saw that when I restart a ceph manager, ceph -s
>>>>>>>> shows recovery info for a short time, 20 min more or less, and
>>>>>>>> then all the info disappears.
>>>>>>>>
>>>>>>>> The thing is that it seems the cluster is not self-recovering
>>>>>>>> and the ceph monitor is "eating" all of the HDD.
>>>>>>>>
>>>>>>>> On 2020-10-26 15:57, Eugen Block wrote:
>>>>>>>>> The recovery process (ceph -s) is independent of the MGR
>>>>>>>>> service and only depends on the MON service. It seems you only
>>>>>>>>> have the one MON; if the MGR is overloading it (not clear why)
>>>>>>>>> it could help to leave the MGR off and see if the MON service
>>>>>>>>> then has enough RAM to proceed with the recovery. Do you have
>>>>>>>>> any chance to add two more MONs? A single MON is of course a
>>>>>>>>> single point of failure.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>>>>>>>>>
>>>>>>>>>> On 2020-10-26 15:16, Eugen Block wrote:
>>>>>>>>>>> You could stop the MGRs and wait for the recovery to
>>>>>>>>>>> finish; MGRs are not a critical component. You won't have a
>>>>>>>>>>> dashboard or metrics during that time but it would prevent
>>>>>>>>>>> the high RAM usage.
>>>>>>>>>>>
>>>>>>>>>>> Quoting "Ing. Luis Felipe Domínguez Vega" <luis.dominguez(a)desoft.cu>:
>>>>>>>>>>>
>>>>>>>>>>>> On 2020-10-26 12:23, 胡 玮文 wrote:
>>>>>>>>>>>>>> On 2020-10-26, at 23:29, Ing. Luis Felipe Domínguez Vega
>>>>>>>>>>>>>> <luis.dominguez(a)desoft.cu> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> mgr: fond-beagle(active, since 39s)
>>>>>>>>>>>>> Your manager seems to be crash looping; it only started
>>>>>>>>>>>>> 39s ago. Looking at the mgr logs may help you identify why
>>>>>>>>>>>>> your cluster is not recovering. You may have hit a bug in
>>>>>>>>>>>>> the mgr.
>>>>>>>>>>>> Nope, I'm restarting the ceph manager because it eats all
>>>>>>>>>>>> the server RAM; I have a script that restarts the manager
>>>>>>>>>>>> when only 1 GB of RAM is free (the server has 94 GB of
>>>>>>>>>>>> RAM). I don't know why, and the manager logs are:
>>>>>>>>>>>>
>>>>>>>>>>>> -----------------------------------
>>>>>>>>>>>> root@fond-beagle:/var/lib/ceph/mon/ceph-fond-beagle/store.db# tail -f /var/log/ceph/ceph-mgr.fond-beagle.log
>>>>>>>>>>>> 2020-10-26T12:54:12.497-0400 7f2a8112b700  0 log_channel(cluster) log [DBG] : pgmap v584: 2305 pgs: 4 active+undersized+degraded+remapped, 4 active+recovery_unfound+undersized+degraded+remapped, 2104 active+clean, 5 active+undersized+degraded, 34 incomplete, 154 unknown; 1.7 TiB data, 2.9 TiB used, 21 TiB / 24 TiB avail; 347248/2606900 objects degraded (13.320%); 107570/2606900 objects misplaced (4.126%); 19/404328 objects unfound (0.005%)
>>>>>>>>>>>> 2020-10-26T12:54:12.497-0400 7f2a8112b700  0 log_channel(cluster) do_log log to syslog
>>>>>>>>>>>> [the same pgmap/do_log pair repeats with identical counters for v585 (12:54:14) through v589 (12:54:22)]
>>>>>>>>>>>> ---------------
>>>>>>>>>>>>
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ok, I will do that... but the thing is that the cluster does
>>>>>>>>>> not show recovery; it does not show that it is doing anything,
>>>>>>>>>> like the recovery info in the ceph -s command, so I don't know
>>>>>>>>>> whether it is recovering or what it is doing.
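One way to tell whether recovery is actually moving is to watch the degraded/misplaced counters in the pgmap lines and check that they shrink between samples. For example, extracting the degraded percentage from a line like the ones in the mgr log above (the sample string is copied from that log):

```shell
# Sample pgmap line taken from the mgr log quoted above.
pgmap_line='pgmap v584: 2305 pgs: ... 347248/2606900 objects degraded (13.320%); 107570/2606900 objects misplaced (4.126%); 19/404328 objects unfound (0.005%)'

# Pull out the degraded percentage; if this number falls between two
# successive `ceph -s` samples, recovery is making progress.
degraded=$(echo "$pgmap_line" | sed -n 's/.*degraded (\([0-9.]*\)%).*/\1/p')
echo "degraded: ${degraded}%"
```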
>
>