After upgrading one of our clusters from Nautilus 14.2.2 to Nautilus 14.2.5 I'm seeing 100% CPU usage by a single ceph-mgr thread (found using 'top -H'). Attaching to the thread with strace shows a lot of mmap and munmap calls. Here's the distribution after watching it for a few minutes:
48.73% - mmap
49.48% - munmap
1.75% - futex
0.05% - madvise
I've upgraded 3 other clusters so far (120 OSDs, 30 OSDs, 200 OSDs), but this is the only one which has seen the problem (355 OSDs). Perhaps it has something to do with its size?
I suspected it might have to do with one of the modules misbehaving, so I disabled all of them:
# ceph mgr module ls | jq -r '.enabled_modules'
[]
But that didn't help (I restarted the mgrs after disabling the modules too).
I also tried setting debug_mgr and debug_mgrc to 20, but nothing stood out as the cause of the problem.
It only seems to affect the active mgr. If I stop the active mgr the problem moves to one of the other mgrs.
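For concreteness, these are the commands behind the numbers above, plus a possible next step (the thread id is a placeholder for whatever 'top -H' reports):

# find the busy thread (TID) of the active mgr
top -H -p $(pidof ceph-mgr)

# sample its syscalls; Ctrl-C after a minute or so to get the summary table above
strace -c -p <TID>

# possible next step: grab userspace backtraces to see what drives the mmap/munmap churn
gdb -p $(pidof ceph-mgr) --batch -ex 'thread apply all bt' > mgr-backtraces.txt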
Any guesses or tips on what next steps I should take to figure out what's going on?
Thanks,
Bryan
Hi,
I just looked through the rbd driver of OpenStack Cinder. It seems there is no additional clear_volume step implemented for the rbd driver. In my case the objects of this rbd image were only partially deleted, so I suspect the problem lies with Ceph rather than with the Cinder driver.
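For reference, the cinder.conf change suggested below would look roughly like this (the backend section name is only an example, and given the missing clear_volume step I'm not sure the rbd driver even honours it):

# cinder.conf, in the relevant backend section (section name is just an example)
[rbd-backend]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
# wipe only the first 50 MB instead of zeroing the whole volume before deletion
volume_clear_size = 50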
br,
Xu Yun
> On Jan 15, 2020, at 7:36 PM, EDH - Manuel Rios <mriosfer(a)easydatahost.com> wrote:
>
> Hi
>
> For huge volumes in OpenStack and Ceph, set this param in your cinder config:
>
> volume_clear_size = 50
>
> That will wipe only the first 50 MB of the volume and then ask Ceph to delete it fully, instead of overwriting the whole disk with zeros, which for huge volumes can sometimes cause timeouts.
>
> In our deployment that was the solution (OpenStack Queens here).
>
>
> -----Original Message-----
> From: Eugen Block <eblock(a)nde.ag>
> Sent: Wednesday, January 15, 2020 8:51
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Re: Objects not removed (completely) when removing a rbd image
>
> Hi,
>
> this might happen if you try to delete images/instances/volumes in openstack that are somehow linked, e.g. if there are snapshots etc. I have experienced this in Ocata, too. Deleting a base image worked, but since there were existing clones, basically only the openstack database was updated and the base image still existed within ceph.
>
> Try to figure out if that is also the case. If it's something else, check the logs in your openstack environment, maybe they reveal something. Also check the ceph logs.
>
> Regards,
> Eugen
>
>
> Quoting 徐蕴 <yunxu(a)me.com>:
>
>> Hello,
>>
>> My setup is Ceph Pike working with OpenStack. When I deleted an image,
>> I found that the space was not reclaimed. I checked with rbd ls and
>> confirmed that the image had disappeared. But when I checked the
>> objects with rados ls, most objects named rbd_data.xxx still existed
>> in my cluster. The rbd_object_map and rbd_header objects were already
>> deleted. I waited for several hours and no further deletion happened.
>> Is this a known issue, or is something wrong with my configuration?
>>
>> br,
>> Xu Yun
Hello,
My setup is Ceph Pike working with OpenStack. When I deleted an image, I found that the space was not reclaimed. I checked with rbd ls and confirmed that the image had disappeared. But when I checked the objects with rados ls, most objects named rbd_data.xxx still existed in my cluster. The rbd_object_map and rbd_header objects were already deleted. I waited for several hours and no further deletion happened. Is this a known issue, or is something wrong with my configuration?
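For completeness, a rough way to inspect (and, carefully, clean up) the leftovers could be something like this; the pool name and the rbd_data prefix are placeholders for the real values:

# make sure the image isn't just sitting in the RBD trash awaiting deferred deletion
rbd trash ls volumes

# list and count the orphaned data objects of the deleted image
rados -p volumes ls | grep '^rbd_data\.1234567890ab'
rados -p volumes ls | grep -c '^rbd_data\.1234567890ab'

# as a last resort (careful!), remove the leftovers one by one
rados -p volumes ls | grep '^rbd_data\.1234567890ab' | \
    while read obj; do rados -p volumes rm "$obj"; done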
br,
Xu Yun
Has anyone ever tried using this feature? I've added it to the [global]
section of ceph.conf on my POC cluster, but I'm not sure how to tell
whether it's actually working. I did find a reference to this feature via
Google where they had it in their [OSD] section, so I've tried that too.
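For what it's worth, one way to check whether a running daemon actually picked up the setting might be something like this (the option name is just a placeholder, and 'ceph config show' needs mimic or newer):

# ask a running daemon for its effective value (run on the node hosting osd.0)
ceph daemon osd.0 config get some_feature_option

# on mimic+ you can also ask for the daemon's effective config via the mons
ceph config show osd.0 | grep some_feature_option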
TIA
Adam
Hi,
When we tried putting some load on our test cephfs setup by restoring a
backup in Artifactory, we eventually ran out of space (around 95% used
in `df` = 3.5TB), which caused Artifactory to abort the restore and clean
up. However, while a simple `find` no longer shows the files, `df` still
claims that we have around 2.1TB of data on the cephfs. `df -i` also
shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
get 31G used, which is data that really is still there and is expected
to be there.
Consequently, we also get the following warning:
> MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
> pool cephfs_data objects per pg (38711) is more than 231.802 times cluster average (167)
We are running ceph 14.2.5.
We have snapshots enabled on cephfs, but there are currently no active
snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
below). I can't say for sure if we created snapshots during the backup
restore.
> {
> "last_snap": 39,
> "last_created": 38,
> "last_destroyed": 39,
> "pending_noop": [],
> "snaps": [],
> "need_to_purge": {},
> "pending_update": [],
> "pending_destroy": []
> }
We only have a single CephFS.
We use the pool_namespace xattr for our various directory trees on the
cephfs.
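Since a plain `rados ls` only shows the default namespace, checking which namespaces the leftover objects actually live in might look like this (the namespace name is only an example):

# list objects across all namespaces (prints "<namespace>  <object>")
rados -p cephfs_data ls --all | head

# or restrict the listing to one of our directory-tree namespaces
rados -p cephfs_data --namespace=myproject ls | head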
`ceph df` shows:
> POOL ID STORED OBJECTS USED %USED MAX AVAIL
> cephfs_data 6 2.1 TiB 2.48M 2.1 TiB 24.97 3.1 TiB
`ceph daemon mds.$hostname perf dump | grep stray` shows:
> "num_strays": 0,
> "num_strays_delayed": 0,
> "num_strays_enqueuing": 0,
> "strays_created": 5097138,
> "strays_enqueued": 5097138,
> "strays_reintegrated": 0,
> "strays_migrated": 0,
`rados -p cephfs_data df` shows:
> POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
> cephfs_data 2.1 TiB 2477540 0 4955080 0 0 0 10699626 6.9 TiB 86911076 35 TiB 0 B 0 B
>
> total_objects 29718
> total_used 329 GiB
> total_avail 7.5 TiB
> total_space 7.8 TiB
When I combine the usage and the free space shown by `df`, we would
exceed our cluster size. Our test cluster currently has 7.8TB total
space with a replication size of 2 for all pools. With 2.1TB
"used" on the cephfs according to `df` + 3.1TB shown as "free", I
get 5.2TB total size. This would mean >10TB of raw data when accounting
for replication. Clearly this can't fit on a cluster with only 7.8TB of
capacity.
Do you have any ideas why we see so many objects and so much reported
usage? Is there any way to fix this without recreating the cephfs?
Florian
--
Florian Pritz
Research Industrial Systems Engineering (RISE) Forschungs-,
Entwicklungs- und Großprojektberatung GmbH
Concorde Business Park F
2320 Schwechat
Austria
E-Mail: florian.pritz(a)rise-world.com
Web: www.rise-world.com
Firmenbuch: FN 280353i
Landesgericht Korneuburg
UID: ATU62886416
Hi all,
When upgrading from Luminous to Nautilus, the centralized config options
for cluster_network and public_network were inadvertently set globally to
an incorrect value (10.192.80.0/24):
-----
[root@ceph-osd134 ceph]# ceph config dump | grep network
<snip>
global advanced cluster_network 10.192.80.0/24 *
global advanced public_network 10.192.80.0/24
-----
Ceph.conf on all nodes is correctly set to 10.0.0.0/8. Even after
restarting the mons I see the following errors with every ceph
command:
-----
2020-01-10 20:06:30.815 7f6deffff700 -1 set_mon_vals failed to set cluster_network = 10.192.80.0/24: Configuration option 'cluster_network' may not be modified at runtime
2020-01-10 20:06:30.815 7f6deffff700 -1 set_mon_vals failed to set public_network = 10.192.80.0/24: Configuration option 'public_network' may not be modified at runtime
-----
How do I safely change/remove the centralized config network settings?
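For reference, the commands I would expect to do it, either by overwriting the values or by dropping them from the config database, are roughly these (part of my question is whether this is actually safe for the network options):

# overwrite the bad values with the correct network
ceph config set global public_network 10.0.0.0/8
ceph config set global cluster_network 10.0.0.0/8

# or drop them from the centralized config entirely
ceph config rm global public_network
ceph config rm global cluster_network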
Thanks,
Frank
Hi All,
Sorry for the repost.
How do you unset a global config setting from the centralized config
with mimic+ (specifically public_network and cluster_network)?
"ceph config rm global public_network"
doesn't seem to do the trick.
These were set inadvertently during an upgrade with:
"ceph config assimilate-conf"
https://ceph.io/community/new-mimic-centralized-configuration-management/
The settings I wish to unset:
-----
[root@ceph-mon001 ceph]# ceph config dump | grep network
global advanced cluster_network 10.192.80.0/24 *
global advanced public_network 10.192.80.0/24 *
-----
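Spelled out, what I'm trying (for both options) and how I'm checking the result looks like this; the entries above are still listed afterwards:

ceph config rm global public_network
ceph config rm global cluster_network

# both entries still show up after the rm
ceph config dump | grep network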
thx
Frank