On Fri, Feb 21, 2020 at 05:28:12AM -0000, alexander.v.litvak(a)gmail.com wrote:
This evening I was awakened by an error message
cluster:
id: 9b4468b7-5bf2-4964-8aec-4b2f4bee87ad
health: HEALTH_ERR
Module 'telemetry' has failed: ('Connection aborted.',
error(101, 'Network is unreachable'))
services:
I have not seen any other problems with anything else on the cluster. I disabled and
enabled the telemetry module and health returned to OK status. Any ideas on what could
cause the issue? As far as I understand, telemetry is a module that sends messages to an
external ceph server outside of the network.
Maybe an uplink issue? We had similar behaviour as we had some trouble with a
core router.
You have been able to disable and enable the module? This failed for me with the
reason that the module had failed (Nautilus). Restarting all mgrs did help.
Still - I'm not sure why this is considered to be a health_err.
Best regards
Thore