Hello, I have upgraded my cluster from Luminous to Nautilus. The cluster is
working, but I am seeing some weird behavior in my monitors and managers.
The monitors are using a huge amount of memory and becoming very slow. The
CPU usage is also much higher than it used to be.
The manager keeps restarting constantly, and on many occasions it reports
all-zero values for the PG counts, space usage, etc.
I have also started receiving many slow ops warnings from the monitors.
Has anyone seen something like this?
What information about my cluster should I send so I can get help from
more experienced users?
This is one of the weird log messages:
2020-03-15 06:26:39.474 7f0396cae700 -1 mon.cwvh2@0(leader) e27
get_health_metrics reporting 603796 slow ops, oldest is pool_op(create
unmanaged snap pool 5 tid 1236847 name v382060)
  cluster:
    id:     4ed305d4-5847-4e63-b9bb-168361cf2e81
    health: HEALTH_WARN
            1/3 mons down, quorum cwvh8,cwvh15

  services:
    mon: 3 daemons, quorum cwvh8,cwvh15 (age 4M), out of quorum: cwvh2
    mgr: cwvh13(active, since 33s), standbys: cwvh14
    osd: 100 osds: 100 up (since 17h), 100 in

  data:
    pools:   6 pools, 4160 pgs
    objects: 19.35M objects, 64 TiB
    usage:   132 TiB used, 94 TiB / 226 TiB avail
    pgs:     4154 active+clean
             6    active+clean+scrubbing+deep

  io:
    client: 192 MiB/s rd, 52 MiB/s wr, 1.25k op/s rd, 1.34k op/s wr
[root@cwvh15 ~]# ceph df
RAW STORAGE:
    CLASS      SIZE        AVAIL       USED        RAW USED    %RAW USED
    backup     120 TiB      62 TiB      58 TiB      58 TiB         48.64
    hdd         12 TiB     4.3 TiB     7.8 TiB     7.8 TiB         64.47
    ssd         94 TiB      28 TiB      66 TiB      66 TiB         70.40
    TOTAL      226 TiB      94 TiB     132 TiB     132 TiB         58.51

POOLS:
    POOL      ID    STORED      OBJECTS     USED        %USED    MAX AVAIL
    rbd        0      19 B            2     128 KiB         0       11 TiB
    CWVHDS     1     420 GiB     110.03k     848 GiB     39.36      653 GiB
    COMP       4     2.7 TiB     717.72k     5.5 TiB     81.19      653 GiB
    SSD        5      35 TiB       9.43M      66 TiB     88.08      4.4 TiB
    HDD        6     730 GiB     196.24k     1.4 TiB     52.84      653 GiB
    BKPR1      8      33 TiB       8.90M      58 TiB     56.73       22 TiB
For Jewel I wrote a script that took the output of `ceph health detail
--format=json` and sent alerts to our system, ordering the OSDs by how long
their ops had been blocked and which OSDs had the most blocked ops. This was
really helpful for quickly identifying which OSD out of a list of 100 was the
most likely one having issues. Since upgrading to Luminous, I don't get that
output anymore and I'm not sure where that info went. Do I need to query
the manager now?
This is the regex I was using to extract the pertinent information:
'^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$'
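In case it's useful, here is a minimal sketch of the kind of ranking the script did, applied to plain-text `ceph health detail` output with the regex above. It assumes the Jewel-style "N ops are blocked > X sec on osd.M" lines are still present; newer releases word their slow-op warnings differently, so the pattern would need adjusting there.
```python
#!/usr/bin/env python3
# Sketch: rank OSDs by blocked ops using the Jewel-style health detail lines.
# Assumes `ceph health detail` emits "N ops are blocked > X sec on osd.M".
import re
import subprocess
from collections import defaultdict

PATTERN = re.compile(r'^(\d+) ops are blocked > (\d+\.+\d+) sec on osd\.(\d+)$')

def blocked_ops_by_osd():
    out = subprocess.run(['ceph', 'health', 'detail'],
                         capture_output=True, text=True, check=True).stdout
    stats = defaultdict(lambda: {'ops': 0, 'longest': 0.0})
    for line in out.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            ops, secs, osd = int(m.group(1)), float(m.group(2)), int(m.group(3))
            stats[osd]['ops'] += ops
            stats[osd]['longest'] = max(stats[osd]['longest'], secs)
    return stats

if __name__ == '__main__':
    ranked = sorted(blocked_ops_by_osd().items(),
                    key=lambda kv: (kv[1]['longest'], kv[1]['ops']),
                    reverse=True)
    for osd, s in ranked:
        print(f"osd.{osd}: {s['ops']} blocked ops, oldest > {s['longest']} sec")
```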
Thanks,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Hello, we are running a Ceph cluster + RGW on Luminous 12.2.12 that serves as S3-compatible storage. We have noticed some buckets where the `rgw.none` section in the output of `radosgw-admin bucket stats` shows an extremely large value for `num_objects`, which is not plausible. It looks like an underflow: a positive number is subtracted from 0 and the result is interpreted and displayed as a uint64. For example,
```
# radosgw-admin bucket stats --bucket redacted
{
"bucket": "redacted",
...........
"usage": {
"rgw.none": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 18446744073709551607
},
"rgw.main": {
"size": 1687971465874,
"size_actual": 1696692400128,
"size_utilized": 1687971465874,
"size_kb": 1648409635,
"size_kb_actual": 1656926172,
"size_kb_utilized": 1648409635,
"num_objects": 4290147
},
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 0,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 0,
"num_objects": 75
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}
```
We did find a few reports of this issue, e.g. http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-November/037531.ht….
Are there any known usage patterns that can lead the object count to become that large? Also, is there a way to accurately collect the object count for each bucket in the cluster? We would like to use it for management purposes.
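For what it's worth, the value quoted above is exactly what a small negative count looks like once it wraps around to an unsigned 64-bit integer; a quick arithmetic check (no Ceph involved):
```python
# 18446744073709551607 == 2**64 - 9, i.e. the uint64 representation of -9,
# consistent with a counter being decremented below zero and printed unsigned.
reported = 18446744073709551607
print(reported - 2**64)             # -9
print((0 - 9) % 2**64 == reported)  # True
```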
A while back I thought there were some limitations which prevented us
from trying this, but I cannot remember...
What does the Ceph VFS gain you over exporting via the CephFS kernel module
(kernel 4.19)? What does it lose you?
(I.e., pros and cons versus the kernel module?)
Thanks!
C.
> It's based on vfs_ceph and you can read more about how to configure it
> yourself on
> https://www.samba.org/samba/docs/current/man-html/vfs_ceph.8.html.
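For reference, a minimal share definition using vfs_ceph might look roughly like the sketch below, along the lines of what that man page describes; the share name, path, and cephx user id are placeholders, not something from a real setup:
```
[cephfs-share]
    # path inside CephFS (vfs_ceph talks to the cluster directly,
    # so this is not a local mount point)
    path = /shares/data
    vfs objects = ceph
    ceph:config_file = /etc/ceph/ceph.conf
    # cephx user the module authenticates as (placeholder name)
    ceph:user_id = samba
    read only = no
    # with no kernel mount backing the share, kernel share modes
    # are typically disabled
    kernel share modes = no
```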
On 3/13/20 10:47 PM, Chip Cox wrote:
> Konstantin - in your Windows environment, would it be beneficial to
> have the ability to have NTFS data land as S3 (object store) on a Ceph
> storage appliance? Or does it have to be NFS?
>
> Thanks and look forward to hearing back.
Nope, for Windows we use CephFS exported over a Samba VFS CTDB cluster.
k
Hello,
I think I've been running into an rbd export/import bug and wanted to see if anybody else had any experience.
We're using RBD images for VM drives, both with and without custom stripe sizes. When we try to export/import a drive to another Ceph cluster, the VM always comes up in a broken state it can't recover from. This happens both when piping the export/import through stdin/stdout and when using an intermediate machine as temporary space. I remember doing this a few times in previous versions without error, so I'm not sure whether this is a regression or whether I'm doing something differently. I'm still testing to try to track down where the issue is, but I wanted to post here to see if anybody else has any experience with it.
Example command: rbd -c /etc/ceph/cluster1.conf export pool/testvm.boot - | rbd -c /etc/ceph/cluster2.conf import - pool/testvm.boot
Current cluster is on 14.2.8 and using Ubuntu 18.04 w/ 5.3.0-40-generic.
Let me know if I can provide any more details to help track this down.
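In case it helps narrow this down, here is a rough sketch of one way to compare the image metadata (order, features, stripe settings, etc.) on both sides after an import; it simply diffs the JSON from `rbd info`, reusing the conf paths and image name from the example command above:
```python
#!/usr/bin/env python3
# Sketch: diff `rbd info` metadata for the same image on two clusters, e.g. to
# see whether stripe settings or features changed across the export/import.
# Conf paths and image name are taken from the example command above.
import json
import subprocess

def rbd_info(conf, image):
    out = subprocess.run(['rbd', '-c', conf, 'info', image, '--format', 'json'],
                         capture_output=True, text=True, check=True).stdout
    return json.loads(out)

src = rbd_info('/etc/ceph/cluster1.conf', 'pool/testvm.boot')
dst = rbd_info('/etc/ceph/cluster2.conf', 'pool/testvm.boot')

for key in sorted(set(src) | set(dst)):
    if src.get(key) != dst.get(key):
        print(f"{key}: source={src.get(key)!r} dest={dst.get(key)!r}")
```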
Thanks,
Hi,
I've always had some MGR stability issues with daemons crashing at
random times, but since the upgrade to 14.2.8 they regularly stop
responding after some time until I restart them (which I have to do at
least once a day).
I noticed right after the upgrade that the prometheus module was
entirely unresponsive and ceph fs status took about half a minute to
return. Once all the cluster chatter had settled and the PGs had been
rebalanced (the PG autoscaler was messing with PGs after the upgrade), it
became usable again, but everything's still slower than before.
Prometheus takes several seconds to list metrics, ceph fs status takes
about 1-2 seconds.
However, after some time, MGRs stop responding and are kicked from the
list of standbys. With log level 5 all they are writing to the log files
is this:
2020-03-11 09:30:40.539 7f8f88984700 4 mgr[prometheus]
::ffff:xxx.xxx.xxx.xxx - - [11/Mar/2020:09:30:40] "GET /metrics
HTTP/1.1" 200 - "" "Prometheus/2.15.2"
2020-03-11 09:30:41.371 7f8f9ee62700 4 mgr send_beacon standby
2020-03-11 09:30:43.392 7f8f9ee62700 4 mgr send_beacon standby
2020-03-11 09:30:45.412 7f8f9ee62700 4 mgr send_beacon standby
2020-03-11 09:30:47.436 7f8f9ee62700 4 mgr send_beacon standby
2020-03-11 09:30:49.460 7f8f9ee62700 4 mgr send_beacon standby
I have seen another email on this list complaining about slow ceph fs
status; I believe this issue is connected.
Besides the standard always-on modules I have enabled the prometheus,
dashboard, and telemetry modules.
Best
Janek
Hi,
Due to the recent developments around the COVID-19 virus we (the
organizers) have decided to cancel the Ceph Day in Oslo on May 13th.
Although it's still 8 weeks away, we don't know how the situation will
develop, whether travel will be possible, or whether people will be willing to travel.
Therefore we thought it was best to cancel the event for now and to
re-schedule it for a later date in 2020.
We haven't picked a date yet. Once chosen we'll communicate it through
the regular channels.
Wido
Hi everyone:
There are two QoS implementations in Ceph (one based on the token bucket algorithm, another based on mClock).
Which one can I use in a production environment? Thank you