Following up to close the loop on this. (I'm not filing a bug because I think this case is uncommon.)

We were scraping the metrics with two different Prometheus clients while transitioning from one metrics system to another.
Turning off one of the clients, so that only one was scraping, solved the issue.
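
For anyone wanting to check whether they are hitting the same slowdown, here is a minimal sketch in Python that times successive scrapes of the endpoint. The helper names (`fetch_metrics`, `time_scrapes`, `main`) are made up for illustration, the 15-second pause is an arbitrary choice, and the URL assumes the module's usual `/metrics` path on the address that `ceph mgr services` reports in the original message; adjust it for your cluster.

```python
import time
import urllib.request


def fetch_metrics(url, timeout=600):
    """Fetch the metrics payload and return its size in bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def time_scrapes(fetch, runs=5, pause=0.0):
    """Call fetch() `runs` times; return a list of (seconds, payload_bytes)."""
    results = []
    for i in range(runs):
        start = time.monotonic()
        size = fetch()
        results.append((time.monotonic() - start, size))
        if i < runs - 1:
            time.sleep(pause)  # wait between scrapes, like a scrape interval
    return results


def main():
    # Assumed URL: host and port from `ceph mgr services`, plus /metrics.
    url = "http://woodenbox2:9283/metrics"
    timings = time_scrapes(lambda: fetch_metrics(url), runs=5, pause=15)
    for i, (secs, size) in enumerate(timings, 1):
        print(f"run {i}: {secs:.1f}s, {size} bytes")
```

Calling `main()` on a host that can reach the mgr prints one line per scrape. If the times climb from run to run while the payload size stays about the same, that matches what we saw; running it while only one client is scraping should show whether the second scraper makes the difference.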

On Mon, Dec 9, 2019 at 5:01 PM Paul Choi <pchoi@nuro.ai> wrote:
Hello,

Is anybody seeing the Prometheus endpoint hang with the new 13.2.7 release?
With 13.2.6, the endpoint would respond with a 15 MB payload in less than 10 seconds.

Now, if you restart ceph-mgr, the Prometheus endpoint responds quickly to the first scrape, then each successive scrape gets slower, until a response takes several minutes.

I have not customized the mgr modules. Apart from the Prometheus module, the other enabled modules ("status" and "zabbix") are working fine.

This is on Ubuntu 16.04 LTS (xenial):
ii  ceph-mgr                             13.2.7-1xenial                             amd64        manager for the ceph distributed storage system

I'd love to know if there's a way to diagnose this issue. I tried raising the debug ms level, but that didn't seem to yield any useful information.

I don't know if this is useful, but "prometheus self-test" passes too.
$ sudo ceph tell mgr.0 prometheus self-test
Self-test OK

pchoi@pchoi-desktop:~$ ceph mgr module ls
{
    "enabled_modules": [
        "prometheus",
        "status",
        "zabbix"
    ],
    "disabled_modules": [
        {
            "name": "balancer",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "crash",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "dashboard",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "hello",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "influx",
            "can_run": false,
            "error_string": "influxdb python module not found"
        },
        {
            "name": "iostat",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "localpool",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "restful",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "selftest",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "smart",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "telegraf",
            "can_run": true,
            "error_string": ""
        },
        {
            "name": "telemetry",
            "can_run": true,
            "error_string": ""
        }
    ]
}
pchoi@pchoi-desktop:~$ ceph mgr services
{
    "prometheus": "http://woodenbox2:9283/"
}