FYI, this is the ceph-exporter we're using at the moment:
https://github.com/digitalocean/ceph_exporter
It's not as good as the built-in prometheus module, but it mostly does
the job. Some of the more specific metrics are missing, but the majority
are there.
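
If you want to see exactly which metrics go missing when you switch,
here is a rough Python sketch that lists the metric names in a scrape
body so you can diff the two endpoints. It only parses the Prometheus
text exposition format. The function names, hosts and ports are my own
placeholders, not anything taken from ceph_exporter itself.

```python
import re
from urllib.request import urlopen

# Matches the metric name at the start of a sample line in the
# Prometheus text exposition format, e.g. `name{label="x"} 1.0`.
_SAMPLE = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{|\s)")


def metric_names(exposition_text):
    """Return the sorted, de-duplicated metric names in a scrape body."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and HELP/TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match:
            names.add(match.group(1))
    return sorted(names)


def scrape(url, timeout=10):
    """Fetch a /metrics page and return its metric names."""
    with urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))


# Hypothetical usage; fill in the hosts and ports for your deployment:
# old = set(scrape("http://<mgr-host>:<port>/metrics"))
# new = set(scrape("http://<exporter-host>:<port>/metrics"))
# print(sorted(old - new))  # names only the mgr module exposed
```

Run the scrape against the mgr module before disabling it; afterwards
there is nothing left to compare against.
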
On 10/12/2020 19:01, Janek Bevendorff wrote:
> Do you have the prometheus module enabled? Turn that off, it's causing
> issues. I replaced it with another ceph exporter from Github and
> almost forgot about it.
>
> Here's the relevant issue report:
>
> https://tracker.ceph.com/issues/39264#change-179946
>
> On 10/12/2020 16:43, Welby McRoberts wrote:
>> Hi Folks
>>
>> We've noticed that in a cluster of 21 nodes (5 mgrs & mons and 504 OSDs,
>> 24 per node), the mgrs are dropping out of the cluster after an
>> unspecified period of time. The logs only show the following:
>>
>> debug 2020-12-10T02:02:50.409+0000 7f1005840700 0
>> log_channel(cluster) log
>> [DBG] : pgmap v14163: 4129 pgs: 4129 active+clean; 10 GiB data, 31 TiB
>> used, 6.3 PiB / 6.3 PiB avail
>> debug 2020-12-10T03:20:59.223+0000 7f10624eb700 -1 monclient:
>> _check_auth_rotating possible clock skew, rotating keys expired way too
>> early (before 2020-12-10T02:20:59.226159+0000)
>> debug 2020-12-10T03:21:00.223+0000 7f10624eb700 -1 monclient:
>> _check_auth_rotating possible clock skew, rotating keys expired way too
>> early (before 2020-12-10T02:21:00.226310+0000)
>>
>> The _check_auth_rotating message repeats approximately every second. The
>> instances are all syncing their time with NTP and have no issues on that
>> front. A restart of the mgr fixes the issue.
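
Inline note on the timestamps: each _check_auth_rotating line carries
two of them, the time it was logged and the "before" cutoff. A minimal
Python sketch to pull both out and print the gap, in case anyone wants
to compare against their own logs. The function name is made up, and I
am assuming the line format is exactly as quoted here.

```python
import re
from datetime import datetime

# ISO-8601 timestamps with fractional seconds and a numeric UTC offset,
# in the form they take in the quoted mgr log lines.
_TS = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{4}")
_FMT = "%Y-%m-%dT%H:%M:%S.%f%z"


def auth_rotating_gap(log_line):
    """Seconds between the log timestamp and the 'before' cutoff."""
    stamps = _TS.findall(log_line)
    if len(stamps) < 2:
        raise ValueError("expected two timestamps in the line")
    logged, cutoff = (datetime.strptime(s, _FMT) for s in stamps[:2])
    return (logged - cutoff).total_seconds()


# First _check_auth_rotating line from the report, rejoined onto one line.
line = (
    "debug 2020-12-10T03:20:59.223+0000 7f10624eb700 -1 monclient: "
    "_check_auth_rotating possible clock skew, rotating keys expired way too "
    "early (before 2020-12-10T02:20:59.226159+0000)"
)
print(round(auth_rotating_gap(line), 3))  # 3599.997
```

So the cutoff sits just under one hour behind the log time in the
quoted lines.
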
>>
>> It appears that this may be related to
>> https://tracker.ceph.com/issues/39264.
>> The suggestion seems to be to disable prometheus metrics; however, this
>> obviously isn't realistic for a production environment where metrics are
>> critical for operations.
>>
>> Please let us know what additional information we can provide to assist
>> in resolving this critical issue.
>>
>> Cheers
>> Welby
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io