On Mon, Dec 09, 2019 at 05:01:04PM -0800, Paul Choi wrote:
Hello,
Anybody seeing the Prometheus endpoint hanging with the new 13.2.7
release?
With 13.2.6 the endpoint would respond with a payload of 15MB in less
than 10 seconds.
I'd guess it's not the prometheus module itself:
$ git diff v13.2.6 v13.2.7 src/pybind/mgr/prometheus
diff --git a/src/pybind/mgr/prometheus/module.py b/src/pybind/mgr/prometheus/module.py
index 2d4472434a..3a398e0b0c 100644
--- a/src/pybind/mgr/prometheus/module.py
+++ b/src/pybind/mgr/prometheus/module.py
@@ -142,7 +142,8 @@ class Metric(object):
         def promethize(path):
             ''' replace illegal metric name characters '''
-            result = path.replace('.', '_').replace('+', '_plus').replace('::', '_')
+            result = path.replace('.', '_').replace(
+                '+', '_plus').replace('::', '_').replace(' ', '_')
             # Hyphens usually turn into underscores, unless they are
             # trailing
@@ -720,7 +721,8 @@ class Module(MgrModule):
             raise cherrypy.HTTPError(503, 'No MON connection')
         # Make the cache timeout for collecting configurable
-        self.collect_timeout = self.get_localized_config('scrape_interval', 5.0)
+        self.collect_timeout = float(self.get_localized_config(
+            'scrape_interval', 5.0))
         server_addr = self.get_localized_config('server_addr', DEFAULT_ADDR)
         server_port = self.get_localized_config('server_port', DEFAULT_PORT)
So the mgr would be a likely suspect. If you could open a tracker ticket,
ideally with mgr debug logs attached, this can be looked at.
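
To make it easier to see how small the two changes are, here is a rough
standalone sketch. It reproduces only the lines shown in the diff (the real
promethize() goes on to handle hyphens, which is omitted here). The metric
name and the string-valued config are made-up examples, and the idea that
the config store may return a string is an assumption suggested by the
float() cast, not something the diff states.

```python
def promethize_old(path):
    """13.2.6 line from the diff: spaces are left untouched."""
    return path.replace('.', '_').replace('+', '_plus').replace('::', '_')


def promethize_new(path):
    """13.2.7 line from the diff: spaces are replaced with underscores too."""
    return path.replace('.', '_').replace(
        '+', '_plus').replace('::', '_').replace(' ', '_')


# Hypothetical metric path containing a space.
name = "osd.op r"
print(promethize_old(name))
print(promethize_new(name))

# Second hunk: the timeout is cast to float before use.
# Assumption: a user-set option may come back as a string such as "15.0".
configured = "15.0"
collect_timeout = float(configured)

elapsed = 7.2  # hypothetical seconds since the last collection
cache_is_fresh = elapsed < collect_timeout
print(cache_is_fresh)
```

Neither change touches the collection loop, which is why the slowdown
probably has to be looked for elsewhere in the mgr.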
Now, if you restart ceph-mgr, the Prometheus endpoint responds quickly
for the first run, then successive runs get slower and slower, until it
takes several minutes.
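
One way to put numbers on this for a tracker ticket is to time a series of
scrapes right after a restart. The following is only a sketch: the function
name, the run count and the pause are arbitrary choices, and it assumes the
module serves its metrics under /metrics on the address reported by
"ceph mgr services".

```python
import time
import urllib.request


def time_scrapes(url, runs=5, pause=1.0, timeout=600.0, fetch=None):
    """Fetch `url` `runs` times; return a list of (seconds, payload_bytes)."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET with a generous timeout, since
        # the endpoint is reported to take minutes once it has degraded.
        def fetch(u):
            with urllib.request.urlopen(u, timeout=timeout) as resp:
                return resp.read()

    results = []
    for i in range(runs):
        start = time.monotonic()
        body = fetch(url)
        results.append((time.monotonic() - start, len(body)))
        if i < runs - 1:
            time.sleep(pause)
    return results


# Example use (host and port are the ones from this thread):
#
#   for n, (secs, size) in enumerate(
#           time_scrapes("http://woodenbox2:9283/metrics", runs=10), 1):
#       print("run %d: %.2fs, %d bytes" % (n, secs, size))
```

If the times grow from run to run while the payload size stays about the
same, that would point at the collection path rather than at the amount of
data being exported.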
I have not customized the mgr modules. The "status" and "zabbix" modules
are working fine; only the Prometheus module is affected.
This is on Ubuntu 16 LTS:
ii  ceph-mgr  13.2.7-1xenial  amd64  manager for the ceph distributed storage system
I'd love to know if there's a way to diagnose this issue - I tried
upping the debug ms level but that doesn't seem to yield useful
information.
I don't know if this is useful, but "prometheus self-test" is fine too.
$ sudo ceph tell mgr.0 prometheus self-test
Self-test OK
pchoi@pchoi-desktop:~$ ceph mgr module ls
{
    "enabled_modules": [
        "prometheus",
        "status",
        "zabbix"
    ],
    "disabled_modules": [
        {
            "name": "balancer",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "crash",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "dashboard",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "hello",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "influx",
            "can_run": false,
            "error_string": "influxdb python module not found"
        },
        {
            "name": "iostat",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "localpool",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "restful",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "selftest",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "smart",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "telegraf",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "telemetry",
            "can_run": true,
            "error_string": ""
        }
    ]
}
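
For anyone comparing module states across hosts, the JSON above can be
filtered mechanically. This is a small sketch, with a hypothetical helper
name and a sample trimmed from the output above, that lists the disabled
modules that cannot run together with the reason given.

```python
import json


def blocked_modules(module_ls_json):
    """Return {name: error_string} for disabled modules that cannot run."""
    data = json.loads(module_ls_json)
    return {
        m["name"]: m["error_string"]
        for m in data.get("disabled_modules", [])
        if not m.get("can_run", True)
    }


# Trimmed copy of the `ceph mgr module ls` output from this message.
sample = """
{
    "enabled_modules": ["prometheus", "status", "zabbix"],
    "disabled_modules": [
        {"name": "balancer", "can_run": true, "error_string": ""},
        {"name": "influx", "can_run": false,
         "error_string": "influxdb python module not found"}
    ]
}
"""

print(blocked_modules(sample))
# {'influx': 'influxdb python module not found'}
```

Here only "influx" is blocked, and it is not enabled, so it is unlikely to
be related to the slow endpoint.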
pchoi@pchoi-desktop:~$ ceph mgr services
{
    "prometheus": "http://woodenbox2:9283/"
}
--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer