Throughput metrics missing when updating Ceph Quincy to Reef


Jose Vicente

18 Jan 2024

11:32 a.m.

Attachments:

attachment.html (text/html — 2.1 KB)


Martin

24 Jan

12:28 p.m.

Hi,

Confirmed that this happens to me as well. After upgrading from 18.2.0 to 18.2.1, OSD metrics like ceph_osd_op_* are missing from ceph-mgr.

The Grafana dashboard also doesn't display all graphs correctly, for example in ceph-dashboard/Ceph - Cluster: Capacity used, Cluster I/O, OSD Capacity Utilization, PGs per OSD...

curl http://localhost:9283/metrics | grep -i ceph_osd_op
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38317  100 38317    0     0   9.8M      0 --:--:-- --:--:-- --:--:-- 12.1M

Before upgrading to Reef 18.2.1 I could get all the metrics.

Martin

On 18/01/2024 12:32, Jose Vicente wrote:
> Hi,
> After upgrading from Quincy to Reef, the ceph-mgr daemon is not exposing some OSD throughput metrics like ceph_osd_op_*:
> curl http://localhost:9283/metrics | grep -i ceph_osd_op
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  295k  100  295k    0     0   144M      0 --:--:-- --:--:-- --:--:--  144M
> However, I can get other metrics like:
> # curl http://localhost:9283/metrics | grep -i ceph_osd_apply
> # HELP ceph_osd_apply_latency_ms OSD stat apply_latency_ms
> # TYPE ceph_osd_apply_latency_ms gauge
> ceph_osd_apply_latency_ms{ceph_daemon="osd.275"} 152.0
> ceph_osd_apply_latency_ms{ceph_daemon="osd.274"} 102.0
> ...
> Before upgrading to Reef (from Quincy) I could get all the metrics. The MGR module prometheus is enabled.
> Rocky Linux release 8.8 (Green Obsidian)
> ceph version 18.2.1 (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable)
> # netstat -nap | grep 9283
> tcp        0      0 127.0.0.1:53834         127.0.0.1:9283          ESTABLISHED 3561/prometheus
> tcp6       0      0 :::9283                 :::*                    LISTEN      804985/ceph-mgr
> Thanks,
> Jose C.
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
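Note that the table of numbers in the output above is only curl's progress meter, which still shows up when stdout is piped; the grep itself matched nothing. A minimal sketch of a quieter check, assuming a POSIX shell with curl and grep available. The helper name count_metric_samples is made up for illustration and is not part of Ceph:

```shell
# count_metric_samples PREFIX
# Reads Prometheus text-format metrics on stdin and prints how many sample
# lines start with PREFIX. Comment lines ("# HELP", "# TYPE") are skipped.
# The function name is hypothetical, not part of Ceph.
count_metric_samples() {
    # grep -c prints 0 and exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    grep -v '^#' | grep -c "^$1" || true
}

# Example against the mgr endpoint from this thread; -s silences curl's
# progress meter so only the count is printed:
#   curl -s http://localhost:9283/metrics | count_metric_samples ceph_osd_op
```

A count of 0 for ceph_osd_op on port 9283 would match what Jose and Martin report after the upgrade.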

Eugen Block

25 Jan

11:06 p.m.

Hi,

I got those metrics back after setting:

reef01:~ # ceph config set mgr mgr/prometheus/exclude_perf_counters false

reef01:~ # curl http://localhost:9283/metrics | grep ceph_osd_op | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  324k  100  324k    0     0  72.5M      0 --:--:-- --:--:-- --:--:-- 79.1M
# HELP ceph_osd_op Client operations
# TYPE ceph_osd_op counter
ceph_osd_op{ceph_daemon="osd.0"} 139650.0
ceph_osd_op{ceph_daemon="osd.11"} 9711090.0
ceph_osd_op{ceph_daemon="osd.2"} 3864.0
ceph_osd_op{ceph_daemon="osd.1"} 25.0
ceph_osd_op{ceph_daemon="osd.4"} 543.0
ceph_osd_op{ceph_daemon="osd.5"} 12192.0
ceph_osd_op{ceph_daemon="osd.3"} 3661521.0
ceph_osd_op{ceph_daemon="osd.6"} 2030.0

I found the option in the docs [1]. The same section is in the Quincy docs as well, although there's no such option in my Quincy cluster; maybe that's why it still exports those performance counters there:

quincy-1:~ # ceph config get mgr mgr/prometheus/exclude_perf_counters
Error ENOENT: unrecognized key 'mgr/prometheus/exclude_perf_counters'

Anyway, this should bring back the metrics the "legacy" way (I guess). Apparently, the ceph-exporter daemon is now required on your hosts to collect those metrics.
After adding the ceph-exporter service (ceph orch apply ceph-exporter) and setting mgr/prometheus/exclude_perf_counters back to "true" I see that there are "ceph_osd_op" metrics defined but no values yet. Apparently, I'm still missing something, I'll check tomorrow. But this could/should be in the upgrade docs IMO.

Regards,
Eugen

[1] https://docs.ceph.com/en/latest/mgr/prometheus/#ceph-daemon-performance-cou…

Eugen Block

11:14 p.m.

Yeah, it's mentioned in the upgrade docs [2]:

> Monitoring & Alerting
> Ceph-exporter: Now the performance metrics for Ceph daemons are exported by ceph-exporter, which deploys on each daemon rather than using prometheus exporter. This will reduce performance bottlenecks.

[2] https://docs.ceph.com/en/latest/releases/reef/#major-changes-from-quincy
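The two setups discussed here (perf counters served by the mgr prometheus module, or left to ceph-exporter) can be summarized as a small wrapper. This is only a sketch: the function name and its legacy/exporter arguments are invented for illustration, while the ceph config command and the option name are the ones quoted in the thread. It assumes a Reef cluster where the option exists; on Quincy the key is reported as unrecognized.

```shell
# set_mgr_perf_counters legacy|exporter
# Hypothetical helper around the option quoted in this thread.
#   legacy   -> mgr prometheus module exports perf counters itself (port 9283)
#   exporter -> mgr excludes them; ceph-exporter is expected to serve them
#               ("true" is the value Eugen set the option back to)
set_mgr_perf_counters() {
    case "$1" in
        legacy)
            ceph config set mgr mgr/prometheus/exclude_perf_counters false
            ;;
        exporter)
            ceph config set mgr mgr/prometheus/exclude_perf_counters true
            ;;
        *)
            echo "usage: set_mgr_perf_counters legacy|exporter" >&2
            return 2
            ;;
    esac
}
```

With the exporter setting, the ceph-exporter service still has to be running on the hosts (ceph orch apply ceph-exporter in a cephadm deployment), otherwise the counters are exported nowhere.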

Eugen Block

11:17 p.m.

Ah, there they are (different port):

reef01:~ # curl http://localhost:9926/metrics | grep ceph_osd_op | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  124k  100  124k    0     0   111M      0 --:--:-- --:--:-- --:--:--  121M
# HELP ceph_osd_op Client operations
# TYPE ceph_osd_op counter
ceph_osd_op{ceph_daemon="osd.1"} 25
ceph_osd_op{ceph_daemon="osd.4"} 543
ceph_osd_op{ceph_daemon="osd.5"} 12192
# HELP ceph_osd_op_delayed_degraded Count of ops delayed due to target object being degraded
# TYPE ceph_osd_op_delayed_degraded counter
ceph_osd_op_delayed_degraded{ceph_daemon="osd.1"} 0
ceph_osd_op_delayed_degraded{ceph_daemon="osd.4"} 0
ceph_osd_op_delayed_degraded{ceph_daemon="osd.5"} 0

I can't check the dashboard right now, but I will definitely do that tomorrow.

Good night!
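To see at a glance which endpoint serves a given metric family, the two dumps can be compared offline. A minimal sketch, assuming the metrics of each endpoint were first saved to a file; the function name and the label/file argument convention are hypothetical, and the ports are the ones seen in this thread (9283 for the mgr module, 9926 for ceph-exporter):

```shell
# endpoints_with_metric PREFIX LABEL FILE [LABEL FILE ...]
# Prints the LABEL of every FILE that contains at least one sample line
# (comments skipped) starting with PREFIX. Hypothetical helper.
endpoints_with_metric() {
    local prefix="$1"
    shift
    while [ "$#" -ge 2 ]; do
        if grep -v '^#' "$2" | grep -q "^${prefix}"; then
            echo "$1"
        fi
        shift 2
    done
}

# Example:
#   curl -s http://localhost:9283/metrics > mgr.txt
#   curl -s http://localhost:9926/metrics > exporter.txt
#   endpoints_with_metric ceph_osd_op mgr:9283 mgr.txt exporter:9926 exporter.txt
```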

Martin

26 Jan

8:53 a.m.

Hi Eugen,

Yes, you are right. After the upgrade from v18.2.0 to v18.2.1 it is necessary to create the ceph-exporter service manually and deploy it to all hosts. The dashboard is fine as well.

Thanks for the help.

Martin
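Once the service has been created (ceph orch apply ceph-exporter, as mentioned earlier in the thread), each host can be probed on the exporter port. A rough sketch, assuming the exporter listens on port 9926 as on Eugen's test node and that the hosts are reachable by name; the function name and the example host names are hypothetical:

```shell
# check_exporter_hosts HOST [HOST ...]
# For each host, report whether its ceph-exporter endpoint returns any
# ceph_osd_op samples. Hypothetical helper; port 9926 is taken from this thread.
check_exporter_hosts() {
    local host
    for host in "$@"; do
        if curl -s --max-time 5 "http://${host}:9926/metrics" | grep -q '^ceph_osd_op'; then
            echo "${host}: ok"
        else
            echo "${host}: missing"
        fi
    done
}

# Example (host names are placeholders):
#   check_exporter_hosts reef01 reef02 reef03
```

A host that runs no OSDs would also show up as "missing" here, so the result is only meaningful for OSD hosts.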

Eugen Block

9:11 a.m.

Yes, my dashboard looks good here as well. :-)

Christian Rohmann

1 Feb

9:10 a.m.

This change is documented at [1], which also mentions the deployment of ceph-exporter, now used to collect per-host metrics from the local daemons.

While this deployment is done by cephadm if used, I am wondering whether ceph-exporter [2] is also built and packaged via the ceph packages [3] for installations that use them?

Regards
Christian

[1] https://docs.ceph.com/en/latest/mgr/prometheus/#ceph-daemon-performance-cou…
[2] https://github.com/ceph/ceph/tree/main/src/exporter
[3] https://docs.ceph.com/en/latest/install/get-packages/

Christian Rohmann

5 Feb

4:16 p.m.

On 01.02.24 10:10, Christian Rohmann wrote:

> [...] I am wondering if ceph-exporter [2] is also built and packaged via the ceph packages [3] for installations that use them?
>
> [2] https://github.com/ceph/ceph/tree/main/src/exporter
> [3] https://docs.ceph.com/en/latest/install/get-packages/

I could not find ceph-exporter in any of the packages or as a single binary, so I opened an issue: https://tracker.ceph.com/issues/64317

Regards
Christian
