I'd like to discuss which questions I should ask to understand the values under 'attrs' of an object in the following JSON data structure, and how to evaluate the health of such objects.
I have a sample JSON output; can you comment on the object's state here?
{ "name": "$image.name", "size": 0, "tag": "", "attrs": { "user.rgw.manifest": "", "user.rgw.olh.idtag": "$tag.uuid", "user.rgw.olh.info": "\u0001\u0001�", "user.rgw.olh.ver": "4" } }
What is the purpose of these fields?

"user.rgw.manifest":
  - What does the empty value "" signify in the context of the object?
  - How does the absence of a value in this field affect the object's health?

"user.rgw.olh.idtag":
  - How is the content of this field generated? (For example, what does the "$tag" value represent?)

"user.rgw.olh.info":
  - What is the function of this field?
  - What information does its content carry about the object's status?

"user.rgw.olh.ver":
  - What does the content of this field signify? (For instance, what does "4" represent?) Does this field represent the object's version?

And finally: what are the distinguishing features that set this object apart from previous versions?
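For reference, a sketch of how such attributes can be inspected (bucket, object, and pool names below are placeholders, and the output format may differ slightly between releases):

  # via RGW's own metadata view of the object
  radosgw-admin object stat --bucket=<bucket> --object=<key>

  # or directly against the backing RADOS object
  rados -p <rgw data pool> listxattr <rados object name>
  rados -p <rgw data pool> getxattr <rados object name> user.rgw.olh.ver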
Hi,
Since the 6.5 kernel addressed the regression in the readahead handling
code, we went ahead and installed this kernel
for a couple of mail / web clusters (Ubuntu 6.5.1-060501-generic
#202309020842 SMP PREEMPT_DYNAMIC Sat Sep 2 08:48:34 UTC 2023 x86_64
x86_64 x86_64 GNU/Linux). Since then we occasionally see the following
being logged by the kernel:
[Sun Sep 10 07:19:00 2023] workqueue: delayed_work [ceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
[Sun Sep 10 08:41:24 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
[Sun Sep 10 11:05:55 2023] workqueue: delayed_work [ceph] hogged CPU for >10000us 8 times, consider switching to WQ_UNBOUND
[Sun Sep 10 12:54:38 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 8 times, consider switching to WQ_UNBOUND
[Sun Sep 10 19:06:37 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 16 times, consider switching to WQ_UNBOUND
[Mon Sep 11 10:53:33 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 32 times, consider switching to WQ_UNBOUND
[Tue Sep 12 10:14:03 2023] workqueue: ceph_con_workfn [libceph] hogged CPU for >10000us 64 times, consider switching to WQ_UNBOUND
[Tue Sep 12 11:14:33 2023] workqueue: ceph_cap_reclaim_work [ceph] hogged CPU for >10000us 4 times, consider switching to WQ_UNBOUND
We wonder whether this is a new phenomenon, or whether it is simply logged
by the new kernel and was not logged before.
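Our working assumption is that these messages come from the workqueue CPU-hog auto-detection added around the 6.5 kernel; if that is the case, the reporting threshold should be tunable (the parameter name below is our reading of the kernel-parameters documentation, please correct us if that is wrong):

  # raise the detection threshold from the default 10000 microseconds, e.g. on the kernel command line:
  #   workqueue.cpu_intensive_thresh_us=30000
  # it may also be exposed as a runtime module parameter:
  cat /sys/module/workqueue/parameters/cpu_intensive_thresh_us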
However, we have hit a few OOM situations because of ceph_cap_reclaim_work
events since we switched to the new kernel (the OOM occurs because Apache
threads keep piling up as they cannot access CephFS). We then also see MDS
slow ops reported. This might be related to a backup job that is running
on a backup server. We did not observe this behavior on the 5.12.19 kernel.
The Ceph cluster is currently on 16.2.11.
Does anyone have some insight into this?
Thanks,
Stefan
Hi, Experts,
We have a Ceph cluster reporting HEALTH_ERR due to multiple old versions.
health: HEALTH_ERR
There are daemons running multiple old versions of ceph
After running `ceph version`, we see three Ceph versions in the 16.2.* series; these daemons are Ceph OSDs.
Our question is: how can we stop this version check? We cannot upgrade all of the old daemons.
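What we are considering, but have not tried yet, is muting the health code rather than disabling the check; this assumes the code shown by `ceph health detail` is DAEMON_OLD_VERSION:

  ceph health detail
  ceph health mute DAEMON_OLD_VERSION 4w

There also seems to be a mon_warn_older_version_delay option, but we have not verified it on 16.2.*.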
Thanks,
Xiong
Hello!
I'm very new to Ceph, sorry I'm asking extremely basic questions.
I just upgraded from 17.2.6 to 17.2.7 and got this warning:
2 pool(s) do not have an application enabled
These pools are
5 cephfs.cephfs.meta
6 cephfs.cephfs.data
I don't remember why and how I created them; I just followed some
instructions...
And I don't remember their state before the upgrade :-(
In the dashboard I see that 0 bytes are used in both pools.
But I have two other pools
3 cephfs_data
4 cephfs_metadata
which are in use by cephfs:
ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
and really have data in them.
Could you tell me: can I just remove these two pools that have no
application enabled, given that everything works, i.e. CephFS is mounted and accessible?
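For reference, the checks I was planning to run first (a sketch; pool names as listed above):

  ceph osd pool ls detail                # confirm which pools have an application set
  rados -p cephfs.cephfs.meta ls | head  # see whether these pools are really empty
  rados -p cephfs.cephfs.data ls | head
  ceph fs ls                             # confirm cephfs only uses cephfs_data / cephfs_metadata

If they really are unused, I assume removal would be something like
`ceph osd pool rm <pool> <pool> --yes-i-really-really-mean-it` (with mon_allow_pool_delete enabled), but I have not tried it.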
Thank you!
Firstly I'm rolling out a rook update from v1.12.2 to v1.12.7 (latest
stable) and ceph from 17.2.6 to 17.2.7 at the same time. I mention this in
case the problem is actually caused by rook rather than ceph. It looks like
ceph to my uninitiated eyes, though.
The update just started bumping my OSDs and the first one fails in the
'activate' init container. The complete logs for this container are:
+ OSD_ID=5
+ CEPH_FSID=<redacted>
+ OSD_UUID=<redacted>
+ OSD_STORE_FLAG=--bluestore
+ OSD_DATA_DIR=/var/lib/ceph/osd/ceph-5
+ CV_MODE=raw
+ DEVICE=/dev/sdc
+ cp --no-preserve=mode /etc/temp-ceph/ceph.conf /etc/ceph/ceph.conf
+ python3 -c '
import configparser
config = configparser.ConfigParser()
config.read('\''/etc/ceph/ceph.conf'\'')
if not config.has_section('\''global'\''):
config['\''global'\''] = {}
if not config.has_option('\''global'\'','\''fsid'\''):
config['\''global'\'']['\''fsid'\''] = '\''<redacted>'\''
with open('\''/etc/ceph/ceph.conf'\'', '\''w'\'') as configfile:
config.write(configfile)
'
+ ceph -n client.admin auth get-or-create osd.5 mon 'allow profile osd' mgr 'allow profile osd' osd 'allow *' -k /etc/ceph/admin-keyring-store/keyring
[osd.5]
key = <redacted>
+ [[ raw == \l\v\m ]]
++ mktemp
+ OSD_LIST=/tmp/tmp.CekJVsr9gr
+ ceph-volume raw list /dev/sdc
Traceback (most recent call last):
  File "/usr/sbin/ceph-volume", line 11, in <module>
    load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
    self.main(self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/main.py", line 32, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 166, in main
    self.list(args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 122, in list
    report = self.generate(args.device)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/raw/list.py", line 91, in generate
    info_device = [info for info in info_devices if info['NAME'] == dev][0]
IndexError: list index out of range
So it has failed executing `ceph-volume raw list /dev/sdc`.
It looks like this code is new in 17.2.7. Is this a regression? What would
be the simplest way to back out of it?
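For what it's worth, the failing line filters the parsed lsblk output by device name, so my guess is that the name lsblk reports inside the container does not match /dev/sdc. Something like the following, run inside the activate container, might show what it actually sees (the columns ceph-volume requests may differ from these):

  lsblk --paths --output NAME,KNAME,PKNAME,TYPE /dev/sdc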
Thanks,
Matt
--
Matthew Booth
Hi,
I've been using Ceph on a 4-host cluster for a year now. I recently discovered the Ceph Dashboard :-)
Now I see that the Dashboard reports CephNodeNetworkPacketErrors >0.01% or >10 packets/s...
Although all systems work great, I'm worried.
'ip -s link show eno5' results:
2: eno5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether 7a:3b:79:9c:f6:d1 brd ff:ff:ff:ff:ff:ff permaddr 5c:ba:2c:08:b3:90
RX: bytes packets errors dropped missed mcast
734153938129 645770129 20160 0 0 342301
TX: bytes packets errors dropped carrier collsns
1085134190597 923843839 0 0 0 0
altname enp178s0f0
So on average about 0.003% of RX packets have errors (20160 / 645770129)!
All four hosts use the same 10Gb HP switch. The hosts themselves are HP ProLiant G10 servers. I would expect 0% packet loss...
Anyway. Should I be worried about data consistency? Or can Ceph handle this amount of packet errors?
Greetings,
Dominique.
I have a 3-node Ceph cluster in my home lab. One of the pools spans 3
HDDs, one on each node, and has size 2, min_size 1. One of my nodes is
currently down, and I have 160 PGs in the 'unknown' state. The other 2
hosts are up and the cluster has quorum.
Example `ceph health detail` output:
pg 9.0 is stuck inactive for 25h, current state unknown, last acting []
I have 3 questions:
1. Why would the PGs be in an unknown state?
2. I would like to recover the cluster without recovering the failed
   node, primarily so that I know I can. Is that possible?
3. The boot NVMe of the host has failed, so I will most likely rebuild
   it. I'm running Rook, and I will most likely delete the old node and
   create a new one with the same name. AFAIK, the OSDs are fine. When
   Rook rediscovers the OSDs, will it add them back with data intact? If
   not, is there any way I can make it so it will?
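For reference, the diagnostics I have been looking at so far (a sketch; pg 9.0 is just the example from above):

  ceph osd tree                 # confirm which OSDs the cluster considers down
  ceph pg map 9.0               # show the up/acting set for one of the unknown PGs
  ceph pg 9.0 query             # may hang or error while the PG has no acting OSDs
  ceph pg dump_stuck inactive   # list all stuck/inactive PGs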
Thanks!
--
Matthew Booth
Hello,
I've recently made the decision to gradually decommission my Nautilus
cluster and migrate the hardware to a new Pacific or Quincy cluster. By
gradually, I mean that as I expand the new cluster I will move (copy/erase)
content from the old cluster to the new, making room to decommission more
nodes and move them over.
In order to do this I will, of course, need to remove OSD nodes by first
emptying the OSDs on each node.
I noticed that pgremapper (a version prior to October 2021) has a 'drain'
subcommand that allows one to control which target OSDs would receive the
PGs from the source OSD being drained. This seemed like a good idea: If
one simply marks an OSD 'out', its contents would be rebalanced to other
OSDs on the same node that are still active, which seems like it would make
a lot of unnecessary data movement and also make removing the next OSD take
longer.
So I went through the trouble of creating a 'really long' pgremapper drain
command excluding the OSDs of two nodes as targets:
# bin/pgremapper drain 16 \
    --target-osds 00,01,02,03,04,05,06,07,24,25,16,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71 \
    --allow-movement-across host --max-source-backfills 75 --concurrency 20 \
    --verbose --yes
However, when this completed, OSD 16 actually contained more PGs than
before I started. It appears that the mapping generated by pgremapper also
backfilled the OSD as it was draining it.
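(For reference, the way I am counting this, as a sketch:

  ceph pg ls-by-osd 16 | wc -l          # PG count on the OSD, before vs. after the drain
  ceph osd dump | grep pg_upmap_items   # the upmap entries the drain created
)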
So did I miss something here? What is the best way to proceed? I
understand that it would be mayhem to mark 8 of 72 OSDs out and then turn
backfill/rebalance/recover back on. But it seems like there should be a
better way.
Suggestions?
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
Hi,
One of the servers in the Ceph cluster shut down abruptly due to a
power failure. After restarting, the OSDs are not coming up, and the Ceph
health check shows them as down.
When checking the OSD status I see "osd.26 18865 unable to obtain rotating
service keys; retrying".
Every 30 seconds it just logs this message, and it is the same for
all OSDs on the system.
Nov 04 20:03:05 strg-node-03 bash[34287]: debug 2023-11-04T14:33:05.089+0000 7f1f5693c080 -1 osd.26 18865 unable to obtain rotating service keys; retrying
Nov 04 20:03:35 strg-node-03 bash[34287]: debug 2023-11-04T14:33:35.090+0000 7f1f5693c080 -1 osd.26 18865 unable to obtain rotating service keys; retrying
This is ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific
(stable) on Debian 11 bullseye, a cephadm-based installation.
I tried searching for the error message but couldn't find anything useful.
How do I fix this issue?
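Things I have not ruled out yet (a sketch of the usual suspects for this message, mainly clock skew between the restarted node and the mons):

  timedatectl status        # or: chronyc tracking / ntpq -p, on the restarted node
  ceph time-sync-status     # from a node where the CLI works; reports mon clock skew
  ceph -s                   # confirm the mons themselves are healthy and in quorum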
regards,
Amudhan
Hello Users,
It is great to see the note that "S3 multipart uploads using
Server-Side Encryption now replicate correctly in multi-site" in the Quincy
v17.2.7 release. But I see that users who are using [1] still have a
dependency on the item tracked at [2].
I tested with Reef 18.2.0 as well; the PR [3] seems to be merged into
reef, but the configuration is not taking effect.
We're currently stuck at version 17.2.3 and in a situation where we cannot
upgrade to a later release because "rgw_crypt_default_encryption_key" is still WIP
and also MPU with SSE (default key) is only fixed in 17.2.7 :(
[1]
https://docs.ceph.com/en/quincy/radosgw/encryption/#automatic-encryption-fo…
[2] https://tracker.ceph.com/issues/61473
[3] https://github.com/ceph/ceph/pull/52796
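For context, the configuration from [1] that we depend on is essentially the following (key value redacted; shown as a ceph.conf snippet, and the exact section name depends on the deployment, so treat this as a sketch):

  [client.rgw]
  rgw_crypt_default_encryption_key = <base64-encoded 256-bit AES key>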
Appreciate any help.
Regards,
Jayanth