Hello,
I have two main questions here.
1. What can I do when `ceph-bluestore-tool` outputs a stack trace for
`fsck`?
2. How does one recover from lost PGs / data corruption in an RGW
Multi-site setup?
---
I have a Luminous 12.2.12 cluster built on
ceph/daemon:v3.2.10-stable-3.2-luminous-centos-7-x86_64 for all daemons; no
ceph packages are installed on the systems. Each OSD node has 128GB of RAM,
6 SATA SSDs (Micron 5200, 2TB), and 1 NVMe SSD split into 4 OSDs.
osd_memory_target is set to 10GB, which should put each node at about
100/128GB used (10 OSDs x 10GB).
There are 3 PGs down. The 3 OSDs that held those PGs won't stay online;
they crash shortly after starting. These OSDs run on SATA SSDs, which are
being replaced with NVMe SSDs. CRUSH-reweighting the SATA drives down
causes some SATA OSDs to crash, and some NVMe OSDs show slow or blocked
ops (related to the down PGs).
I installed the ceph-osd package on one OSD host. When I ran
`ceph-bluestore-tool`, I got a bunch of tcmalloc and unexpected aio errors.
Exact output below. I also tried `ceph-objectstore-tool` but received
similar results. I cloned the other OSD that has the affected PGs to have a
copy I can work on, but I got the exact same results as before.
---
From what I can see, this is likely due to bad drives plus automation
repeatedly restarting the down OSDs. With 3 PGs down, I am assuming my
next step is to mark those PGs lost. From there, I am unsure what the
recovery procedure is for syncing "clean" data from the other zones back
into the impacted cluster. Is RGW able to handle this? Do I need to use
`rclone`?
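For reference, the rough sequence I have in mind is below (hedged: OSD/PG
IDs and zone names are placeholders, and I would welcome corrections):

$ ceph health detail                        # list the down PGs
$ ceph osd lost 11 --yes-i-really-mean-it   # declare a dead OSD's data gone
$ ceph osd force-create-pg <pgid>           # recreate a PG empty, accepting local data loss
$ radosgw-admin data sync init --source-zone=<healthy-zone>
# ...then restart the RGW daemons in this zone to trigger a full sync

I don't know whether `data sync init` will actually re-pull objects whose
PGs were recreated empty, hence the question.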
---
$ ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-11 fsck
tcmalloc: large alloc 1283989504 bytes == 0x557fdbe46000 @ 0x7fc87e4126d0
0x7fc873354ae9 0x7fc873356073 0x557f89d3d680 0x557f89d2ebcd 0x557f89d30524
0x557f89d318ef 0x557f89d33147 0x557f89bb0d6f 0x557f89b3c91b 0x557f89b6df8a
0x557f89a2c5e1 0x7fc87299d2e1 0x557f89ab03fa (nil)
tcmalloc: large alloc 2567970816 bytes == 0x5580286c8000 @ 0x7fc87e4126d0
0x7fc873354ae9 0x7fc873356073 0x557f89d3d680 0x557f89d2ebcd 0x557f89d30524
0x557f89d318ef 0x557f89d33147 0x557f89bb0d6f 0x557f89b3c91b 0x557f89b6df8a
0x557f89a2c5e1 0x7fc87299d2e1 0x557f89ab03fa (nil)
tcmalloc: large alloc 5135933440 bytes == 0x5580c17ca000 @ 0x7fc87e4126d0
0x7fc873354ae9 0x7fc873356073 0x557f89d3d680 0x557f89d2ebcd 0x557f89d30524
0x557f89d318ef 0x557f89d33147 0x557f89bb0d6f 0x557f89b3c91b 0x557f89b6df8a
0x557f89a2c5e1 0x7fc87299d2e1 0x557f89ab03fa (nil)
tcmalloc: large alloc 3025510400 bytes == 0x557f8f6e6000 @ 0x7fc87e4126d0
0x7fc873354ae9 0x7fc87335582b 0x557f89d75d19 0x557f89d2edda 0x557f89d30524
0x557f89d318ef 0x557f89d33147 0x557f89bb0d6f 0x557f89b3c91b 0x557f89b6df8a
0x557f89a2c5e1 0x7fc87299d2e1 0x557f89ab03fa (nil)
tcmalloc: large alloc 2269913088 bytes == 0x55832469e000 @ 0x7fc87e3f2e50
0x7fc87e4121b9 0x7fc8756ca4f7 0x7fc8756cd304 0x557f89cc4661 0x557f89ad0858
0x557f89ad2224 0x557f89cb7b1d 0x557f89de584c 0x557f89de6a7e 0x557f89e05e7b
0x557f89d2cf48 0x557f89d2efd2 0x557f89d30524 0x557f89d318ef 0x557f89d33147
0x557f89bb0d6f 0x557f89b3c91b 0x557f89b6df8a 0x557f89a2c5e1 0x7fc87299d2e1
0x557f89ab03fa (nil)
2023-07-30 08:27:27.531919 7fc86f689700 -1 bdev(0x557f8add4240
/var/lib/ceph/osd/ceph-11/block) aio to 929504952320~2269908992 but
returned: 2147479552
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: In function 'void
KernelDevice::_aio_thread()' thread 7fc86f689700 time 2023-07-30 08:27:27.532004
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: 397: FAILED assert(0
== "unexpected aio error")
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x7fc8757242c2]
2: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
3: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
4: (()+0x74a4) [0x7fc8740104a4]
5: (clone()+0x3f) [0x7fc872a65d0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
2023-07-30 08:27:27.544215 7fc86f689700 -1
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: In function 'void
KernelDevice::_aio_thread()' thread 7fc86f689700 time 2023-07-30
08:27:27.532004
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: 397: FAILED assert(0
== "unexpected aio error")
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x7fc8757242c2]
2: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
3: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
4: (()+0x74a4) [0x7fc8740104a4]
5: (clone()+0x3f) [0x7fc872a65d0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
-1> 2023-07-30 08:27:27.531919 7fc86f689700 -1 bdev(0x557f8add4240
/var/lib/ceph/osd/ceph-11/block) aio to 929504952320~2269908992 but
returned: 2147479552
0> 2023-07-30 08:27:27.544215 7fc86f689700 -1
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: In function 'void
KernelDevice::_aio_thread()' thread 7fc86f689700 time 2023-07-30
08:27:27.532004
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: 397: FAILED assert(0
== "unexpected aio error")
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x7fc8757242c2]
2: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
3: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
4: (()+0x74a4) [0x7fc8740104a4]
5: (clone()+0x3f) [0x7fc872a65d0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
*** Caught signal (Aborted) **
in thread 7fc86f689700 thread_name:bstore_aio
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (()+0x424fc4) [0x557f89d25fc4]
2: (()+0x110e0) [0x7fc87401a0e0]
3: (gsignal()+0xcf) [0x7fc8729affff]
4: (abort()+0x16a) [0x7fc8729b142a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x28e) [0x7fc87572444e]
6: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
7: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
8: (()+0x74a4) [0x7fc8740104a4]
9: (clone()+0x3f) [0x7fc872a65d0f]
2023-07-30 08:27:27.549175 7fc86f689700 -1 *** Caught signal (Aborted) **
in thread 7fc86f689700 thread_name:bstore_aio
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (()+0x424fc4) [0x557f89d25fc4]
2: (()+0x110e0) [0x7fc87401a0e0]
3: (gsignal()+0xcf) [0x7fc8729affff]
4: (abort()+0x16a) [0x7fc8729b142a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x28e) [0x7fc87572444e]
6: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
7: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
8: (()+0x74a4) [0x7fc8740104a4]
9: (clone()+0x3f) [0x7fc872a65d0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
0> 2023-07-30 08:27:27.549175 7fc86f689700 -1 *** Caught signal
(Aborted) **
in thread 7fc86f689700 thread_name:bstore_aio
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (()+0x424fc4) [0x557f89d25fc4]
2: (()+0x110e0) [0x7fc87401a0e0]
3: (gsignal()+0xcf) [0x7fc8729affff]
4: (abort()+0x16a) [0x7fc8729b142a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x28e) [0x7fc87572444e]
6: (KernelDevice::_aio_thread()+0x1377) [0x557f89cc14c7]
7: (KernelDevice::AioCompletionThread::entry()+0xd) [0x557f89cc725d]
8: (()+0x74a4) [0x7fc8740104a4]
9: (clone()+0x3f) [0x7fc872a65d0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
Aborted
$ ceph-objectstore-tool --data-path=/var/lib/ceph/osd/ceph-11 --op list-pgs
tcmalloc: large alloc 1283989504 bytes == 0x5649b1bdc000 @ 0x7f3af5e756d0
0x7f3aeafbbae9 0x7f3aeafbd073 0x56495defb9e0 0x56495deed01d 0x56495deee974
0x56495deefd3f 0x56495def1597 0x56495de0e47f 0x56495dd95dab 0x56495ddcf9e4
0x56495d7de4db 0x7f3aea6042e1 0x56495d86853a (nil)
tcmalloc: large alloc 2567970816 bytes == 0x5649fe45e000 @ 0x7f3af5e756d0
0x7f3aeafbbae9 0x7f3aeafbd073 0x56495defb9e0 0x56495deed01d 0x56495deee974
0x56495deefd3f 0x56495def1597 0x56495de0e47f 0x56495dd95dab 0x56495ddcf9e4
0x56495d7de4db 0x7f3aea6042e1 0x56495d86853a (nil)
tcmalloc: large alloc 5135933440 bytes == 0x564a97560000 @ 0x7f3af5e756d0
0x7f3aeafbbae9 0x7f3aeafbd073 0x56495defb9e0 0x56495deed01d 0x56495deee974
0x56495deefd3f 0x56495def1597 0x56495de0e47f 0x56495dd95dab 0x56495ddcf9e4
0x56495d7de4db 0x7f3aea6042e1 0x56495d86853a (nil)
tcmalloc: large alloc 3025510400 bytes == 0x56496547c000 @ 0x7f3af5e756d0
0x7f3aeafbbae9 0x7f3aeafbc82b 0x56495df34079 0x56495deed22a 0x56495deee974
0x56495deefd3f 0x56495def1597 0x56495de0e47f 0x56495dd95dab 0x56495ddcf9e4
0x56495d7de4db 0x7f3aea6042e1 0x56495d86853a (nil)
tcmalloc: large alloc 2269913088 bytes == 0x564cfa402000 @ 0x7f3af5e55e50
0x7f3af5e751b9 0x7f3aed12d4f7 0x7f3aed130304 0x56495de9fbc1 0x56495de7a5f8
0x56495de7bfc4 0x56495de9307d 0x56495dfa32dc 0x56495dfa450e 0x56495dfc34db
0x56495deeb398 0x56495deed422 0x56495deee974 0x56495deefd3f 0x56495def1597
0x56495de0e47f 0x56495dd95dab 0x56495ddcf9e4 0x56495d7de4db 0x7f3aea6042e1
0x56495d86853a (nil)
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: In function 'void
KernelDevice::_aio_thread()' thread 7f3ae72f0700 time 2023-07-30
08:37:16.531432
/build/ceph-12.2.12/src/os/bluestore/KernelDevice.cc: 397: FAILED assert(0
== "unexpected aio error")
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x102) [0x7f3aed1872c2]
2: (KernelDevice::_aio_thread()+0x1377) [0x56495de9ca27]
3: (KernelDevice::AioCompletionThread::entry()+0xd) [0x56495dea27bd]
4: (()+0x74a4) [0x7f3aeba734a4]
5: (clone()+0x3f) [0x7f3aea6ccd0f]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.
*** Caught signal (Aborted) **
in thread 7f3ae72f0700 thread_name:bstore_aio
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous
(stable)
1: (()+0x94a0f4) [0x56495debe0f4]
2: (()+0x110e0) [0x7f3aeba7d0e0]
3: (gsignal()+0xcf) [0x7f3aea616fff]
4: (abort()+0x16a) [0x7f3aea61842a]
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x28e) [0x7f3aed18744e]
6: (KernelDevice::_aio_thread()+0x1377) [0x56495de9ca27]
7: (KernelDevice::AioCompletionThread::entry()+0xd) [0x56495dea27bd]
8: (()+0x74a4) [0x7f3aeba734a4]
9: (clone()+0x3f) [0x7f3aea6ccd0f]
Aborted
--
Gregory O’Neill
On 2023/08/02 13:29, Roland Giesler wrote:
>
> On 2023/08/02 12:53, Igor Fedotov wrote:
>> Roland,
>>
>> First of all there are no block.db/block.wal symlinks in OSD folder.
>> Which means there are no standalone DB/WAL any more.
>
> That is surprising. So it seems ceph-volume is not able to extract the
> DB/WAL from an OSD in order to migrate it?
I figured out that if one doesn't specify a separate LV for the DB/WAL,
it is integrated into the data drive.
However, one can create a new DB/WAL for an OSD as follows:
# systemctl stop ceph-osd@14
# ceph-bluestore-tool bluefs-bdev-new-db --path
/var/lib/ceph/osd/ceph-14 --dev-target
/dev/NodeC-nvme1/NodeC-nvme-LV-RocksDB1 --bluestore-block-db-size 45G
inferring bluefs devices from bluestore path
DB device added /dev/dm-20
# systemctl start ceph-osd@14
And, voilà, it did it!
# ls -la /var/lib/ceph/osd/ceph-14/block*
lrwxrwxrwx 1 ceph ceph 50 Dec 25 2022 /var/lib/ceph/osd/ceph-14/block
-> /dev/mapper/0GVWr9-dQ65-LHcx-y6fD-z7fI-10A9-gVWZkY
lrwxrwxrwx 1 root root 10 Aug 2 21:17
/var/lib/ceph/osd/ceph-14/block.db -> /dev/dm-20
I'm just checking it out now, to make sure there are no errors and that it
actually does what I think it does.
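Two checks I plan to use to confirm the new DB is really in use (hedged;
I believe both commands exist, but corrections welcome):

# ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-14/block.db
# ceph daemon osd.14 perf dump bluefs

The first should print a "bluefs db" label for the new device, and the
bluefs perf counters (db_used_bytes and friends) should show the DB
filling up over time.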
Details of this release are summarized here:
https://tracker.ceph.com/issues/62231#note-1
Seeking approvals/reviews for:
smoke - Laura, Radek
rados - Neha, Radek, Travis, Ernesto, Adam King
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade-clients:client-upgrade* - in progress
powercycle - Brad
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
bookworm distro support is an outstanding issue.
TIA
YuriW
Hi,
it seems that after a reboot / OS update my disk labels / device paths may have changed. Since then I get an error like this:
CEPHADM_APPLY_SPEC_FAIL: Failed to apply 1 service(s): osd.osd-12-22_hdd-2
###
RuntimeError: cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:9e2fd45a080aea67d1935d7d9a9025b6db2e8be9173186e068a79a0da5a54ada -e NODE_NAME=ceph-osd07.intern -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=osd-12-22_hdd-2 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/01578d80-6c97-46ba-9327-cb2b13980916:/var/run/ceph:z -v /var/log/ceph/01578d80-6c97-46ba-9327-cb2b13980916:/var/log/ceph:z -v /var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp2cvmr5lf:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpb38cuw7q:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:9e2fd45a080aea67d1935d7d9a9025b6db2e8be9173186e068a79a0da5a54ada lvm batch --no-auto /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq --db-devices /dev/sdg --yes --no-systemd
/bin/docker: stderr Traceback (most recent call last):
/bin/docker: stderr File "/usr/sbin/ceph-volume", line 11, in <module>
/bin/docker: stderr load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/bin/docker: stderr self.main(self.argv)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/bin/docker: stderr return f(*a, **kw)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/bin/docker: stderr terminal.dispatch(self.mapper, subcommand_args)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/bin/docker: stderr instance.main()
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
/bin/docker: stderr terminal.dispatch(self.mapper, self.argv)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 192, in dispatch
/bin/docker: stderr instance = mapper.get(arg)(argv[count:])
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 348, in __init__
/bin/docker: stderr self.args = parser.parse_args(argv)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 1734, in parse_args
/bin/docker: stderr args, argv = self.parse_known_args(args, namespace)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 1766, in parse_known_args
/bin/docker: stderr namespace, args = self._parse_known_args(args, namespace)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 1954, in _parse_known_args
/bin/docker: stderr positionals_end_index = consume_positionals(start_index)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 1931, in consume_positionals
/bin/docker: stderr take_action(action, args)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 1824, in take_action
/bin/docker: stderr argument_values = self._get_values(action, argument_strings)
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 2279, in _get_values
/bin/docker: stderr value = [self._get_value(action, v) for v in arg_strings]
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 2279, in <listcomp>
/bin/docker: stderr value = [self._get_value(action, v) for v in arg_strings]
/bin/docker: stderr File "/usr/lib64/python3.6/argparse.py", line 2294, in _get_value
/bin/docker: stderr result = type_func(arg_string)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/arg_validators.py", line 116, in __call__
/bin/docker: stderr return self._format_device(self._is_valid_device())
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/arg_validators.py", line 127, in _is_valid_device
/bin/docker: stderr super()._is_valid_device(raise_sys_exit=False)
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/arg_validators.py", line 104, in _is_valid_device
/bin/docker: stderr super()._is_valid_device()
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/arg_validators.py", line 69, in _is_valid_device
/bin/docker: stderr super()._is_valid_device()
/bin/docker: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/arg_validators.py", line 47, in _is_valid_device
/bin/docker: stderr raise RuntimeError("Device {} has partitions.".format(self.dev_path))
/bin/docker: stderr RuntimeError: Device /dev/sdq has partitions.
Traceback (most recent call last):
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 9309, in <module>
main()
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 9297, in main
r = ctx.func(ctx)
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1941, in _infer_config
return func(ctx)
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1872, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1969, in _infer_image
return func(ctx)
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1859, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 5366, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/01578d80-6c97-46ba-9327-cb2b13980916/cephadm.0317efb4d3a353d5a77e82f4a4f52582f06970d6aba66473daecf92e26ee3a51", line 1661, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
###
/dev/sdg is my boot device at the moment, which was formerly /dev/sda.
Is it safe to edit the ceph orch yaml file and change the device paths to the new names? Like this:
ceph orch ls --service_name=<service-name> --export > myservice.yaml
vi (change device paths in spec -> data_devices -> path | db_devices -> path)
ceph orch apply -i myservice.yaml [--dry-run]
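For clarity, the edited spec would look roughly like this (a sketch only:
the paths are copied from the failing command above and each would need
updating to the device's new name, with /dev/sdX standing in for wherever
the old /dev/sdg DB device ended up):

service_type: osd
service_id: osd-12-22_hdd-2
placement:
  hosts:
  - ceph-osd07
spec:
  data_devices:
    paths:
    - /dev/sdm
    - /dev/sdn
    - /dev/sdo
    - /dev/sdp
    - /dev/sdq
  db_devices:
    paths:
    - /dev/sdX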
Is that ok / expected behaviour? Or is there a better way?
However, I can see that the mgr detects new devices in the ceph orch log:
mgr.ceph-mon03.lrfomu [INF] Detected new or changed devices on ceph-osd07
Regards,
Kilian
Hi! No matter what I try, using the latest cephfs on an all-ceph-pacific
setup, I've not been able to avoid error messages like the following on
RHEL-family clients:
SELinux: inode=1099954719159 on dev=ceph was found to have an invalid
context=system_u:object_r:unlabeled_t:s0. This indicates you may need
to relabel the inode or the filesystem in question.
What's the answer?
Thanks
Harry Coin
I need some help with this please. The command below gives an error
which is not helpful to me.
ceph-volume lvm migrate --osd-id 14 --osd-fsid
4de2a617-4452-420d-a99b-9e0cd6b2a99b --from db wal --target
NodeC-nvme1/NodeC-nvme-LV-RocksDB1
--> Source device list is empty
Unable to migrate to : NodeC-nvme1/NodeC-nvme-LV-RocksDB1
Alternatively, I have tried specifying only --from db instead of
including wal, but it makes no difference.
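For what it's worth, I assume one can check whether the OSD even has a
standalone DB/WAL to migrate from by looking for the block symlinks (just
a guess that this is relevant):

# ls -la /var/lib/ceph/osd/ceph-14/block*

If only block shows up, with no block.db or block.wal, then presumably
there is no separate source device for --from db wal to find.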
Here is the OSD in question.
# ls -la /dev/ceph-025b887e-4f06-468f-845c-0ddf9ad04990/
lrwxrwxrwx 1 root root 7 Dec 25 2022
osd-block-4de2a617-4452-420d-a99b-9e0cd6b2a99b -> ../dm-4
What is happening here? I want to move the DB/WAL to NVMe storage
without trashing the data OSD and having to go through rebalancing for
each drive I do this for.
thanks
Roland
Hi,
I am having trouble with large OMAP objects in a cluster's RGW index pool.
Some background information: there is CephFS and RBD usage on the main
cluster, but for this issue I think only S3 is relevant.
There is one realm, and one zonegroup with two zones that have bidirectional
sync set up. Since this setup does not allow for autoresharding, we have to
reshard by hand in this cluster – looking forward to Reef!
From the logs:
cluster 2023-07-17T22:59:03.018722+0000 osd.75 (osd.75) 623978 :
cluster [WRN] Large omap object found. Object:
34:bcec3016:::.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5:head
PG: 34.680c373d (34.5) Key count: 962091 Size (bytes): 277963182
The offending bucket looks like this:
# radosgw-admin bucket stats \
| jq '.[] | select(.marker
=="3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9")
|"\(.num_shards) \(.usage["rgw.main"].num_objects)"' -r
131 9463833
Last week the number of objects was about 12 million, which is why I
resharded the offending bucket twice, I think: once to 129 and a second
time to 131, because I wanted some leeway (or lieway? scnr, Sage).
Unfortunately, even after a week the objects were still too big (the log
line above is quite recent), so I looked into it again.
# rados -p raum.rgw.buckets.index ls \
|grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
|sort -V
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.0
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.1
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.2
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.3
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.4
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.5
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.6
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.7
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.8
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.9
.dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9.10
# rados -p raum.rgw.buckets.index ls \
|grep .dir.3caabb9a-4e3b-4b8a-8222-34c33dd63210.10610190.9 \
|sort -V \
|xargs -IOMAP sh -c \
'rados -p raum.rgw.buckets.index listomapkeys OMAP | wc -l'
1013854
1011007
1012287
1011232
1013565
998262
1012777
1012713
1012230
1010690
997111
Apparently, only 11 shards are in use. This would explain why the "Key
count" (from the log line) is about ten times higher than I would expect:
~9.5 million objects spread over 131 shards should come to roughly 72k
keys per shard, not ~1 million.
How can I deal with this issue?
One thing I could try would be to reshard to a lower number first, but I
am not sure whether there are any risks associated with "downsharding".
After that I could reshard back up to something like 97. Or I could
"downshard" directly to 97.
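If "downsharding" is safe, I assume it would be done with the usual
command (bucket name elided, shard count as discussed):

# radosgw-admin bucket reshard --bucket=<bucket> --num-shards=97

But I would rather be sure before running that against this bucket.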
Also, the second zone has a similar problem, but as the error message
lets me know, resharding there would be a bad idea. Will it just take
more time until the new sharding is transferred to the second zone?
Best,
Christian Kugler