Hi,
I am currently trying to figure out how to resolve the
"large objects found in pool 'rgw.usage'"
error.
In the past I trimmed the usage log, but now I am at the point where I need
to trim it down to two weeks.
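This is roughly the trim I have in mind, assuming the usual date syntax (the
end date below is only an example standing in for "two weeks ago"):
radosgw-admin usage trim --end-date=2023-01-01
# or per user:
# radosgw-admin usage trim --uid=<user-id> --end-date=2023-01-01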
I checked the number of omap keys per object, and the distribution is quite off:
# for OBJECT in `rados -p eu-central-1.rgw.usage ls`; do
rados -p eu-central-1.rgw.usage listomapkeys ${OBJECT} | wc -l
done
86968
144388
6188
87854
46652
194788
46234
9622
45768
28376
104348
10018
11112
34374
44744
40638
93664
35476
107794
18020
7172
17836
37344
73496
15572
31570
149352
740
113566
35292
5318
442176
Maybe it would be an option to increase rgw_usage_max_user_shards?
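Something like this is what I have in mind (untested; the shard count is only
an example, and my assumption is that it only affects how new usage entries
are sharded, not the existing large objects):
ceph config set client.rgw rgw_usage_max_user_shards 64
# then restart the RGW daemons so they pick up the new value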
Thanks for the insight, Eugen.
Here's what basically happened:
- Upgrade from Nautilus to Quincy via migration to new cluster on temp
hardware;
- Data from Nautilus migrated successfully to older / lab-type equipment
running Quincy;
- Nautilus Hardware rebuilt for Quincy, data migrated back;
- As data was migrating we set the older nodes to maintenance mode and
started to drain them;
- After several days many OSDs were showing as spinning in "deleting"
status on the portal and were marked OUT;
- At this point we made the incorrect assumption that those OSDs were no
longer required and proceeded to remove those nodes / OSDs.
I understand incomplete PGs are basically lost, and that it's likely a
lengthy task to attempt to salvage data.
Backups will be challenging. I honestly didn't anticipate this kind of
failure being possible with Ceph; we've been using it for several years now
and were encouraged by the orchestrator and performance improvements in the
17 code branch.
The fact is, of the incomplete PGs that have object counts > 0, there's
about 644 GB of data tied up in this mess. There are other incomplete PGs
with object counts = 0, which I understand can be manually marked as
complete (see the sketch below). The cluster has a data usage of 61 TiB.
Of this I can categorize about 14 TB as critical data and 40 TB as data of
medium / high importance.
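On marking those empty PGs complete, this is what I'd expect to run, based
on what I've read so far (only a sketch, only for PGs with object counts = 0,
and all IDs are placeholders; the tool runs against the stopped acting
primary, from inside its container on a cephadm cluster):
ceph orch daemon stop osd.<id>
cephadm shell --name osd.<id>
# inside that shell:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op mark-complete
# then exit the shell and restart the OSD
ceph orch daemon start osd.<id>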
The critical 14 TB is in RBD images on an EC pool; there are other images,
however, of lower importance at this point.
There's also about a 20 TB CephFS file system of lower data importance as
well.
Question - Can you kindly point me to procedures for:
- Identifying the pools / images / files that are affected by incomplete
PGs (my current starting point is sketched below);
- Extracting and reconstructing data for RBD images (these images are
XFS-formatted filesystems);
- Extracting and reconstructing data for CephFS files not affected by
incomplete PGs.
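For the first item, here is the direction I've started in (a rough sketch
only; the OSD, PG and pool names are examples or placeholders, and since IO
against an incomplete PG blocks, the object listing is done offline against
one surviving shard with that OSD stopped):
# which pools do the incomplete PGs belong to (pool id = the number before the dot)
ceph pg ls incomplete
ceph osd lspools
# list the objects one shard of an affected PG still holds (example: PG 2.35 on osd.44)
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-44 --pgid 2.35 --op list > pg_2.35_objects.txt
# RBD data objects are named <block_name_prefix>.<object number>; map prefixes back to images
for IMG in $(rbd ls <rbd-pool>); do
  echo "${IMG}: $(rbd info <rbd-pool>/${IMG} | grep block_name_prefix)"
done
# images whose prefix never shows up in an affected PG could then be copied off, e.g.:
# rbd export <rbd-pool>/<image> /backup/<image>.img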
Much appreciated.
------------------------------
Date: Mon, 09 Jan 2023 10:12:49 +0000
From: Eugen Block <eblock(a)nde.ag>
Subject: [ceph-users] Re: Serious cluster issue - Incomplete PGs
To: ceph-users(a)ceph.io
Hi,
can you clarify what exactly you did to get into this situation? What
about the undersized PGs, any chance to bring those OSDs back online?
Regarding the incomplete PGs I'm not sure there's much you can do if
the OSDs are lost. To me it reads like you may have
destroyed/recreated more OSDs than you should have; just recreating
OSDs with the same IDs is not sufficient if you destroyed too many
chunks. Each OSD only contains a chunk of the PG due to the erasure
coding. I'm afraid those objects are lost and you would have to
restore from backup. To get the cluster into a healthy state again
there are a couple of threads, e.g. [1], but recovering the lost chunks
from Ceph will probably not work.
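If it helps to reason about how many lost OSDs a PG could tolerate, you can
look up the k+m values of the affected pools (pool and profile names below
are placeholders):
ceph osd pool get <ec-pool> erasure_code_profile
ceph osd erasure-code-profile get <profile-name>
# with k data and m coding chunks, a PG can lose at most m shards before the data cannot be reconstructed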
Regards,
Eugen
[1] https://www.mail-archive.com/ceph-users@ceph.io/msg14757.html
Quoting Deep Dish <deeepdish(a)gmail.com>:
> Hello. I really screwed up my ceph cluster. Hoping to get data off it
> so I can rebuild it.
>
> In summary, too many changes too quickly caused the cluster to develop
> incomplete PGs. Some PGs were reporting that OSDs were to be probed.
> I've created those OSD IDs (empty), however this wouldn't clear
> incompletes. Incompletes are part of EC pools. Running 17.2.5.
>
> This is the overall state:
>
> cluster:
>
> id: 49057622-69fc-11ed-b46e-d5acdedaae33
>
> health: HEALTH_WARN
>
> Failed to apply 1 service(s):
osd.dashboard-admin-1669078094056
>
> 1 hosts fail cephadm check
>
> cephadm background work is paused
>
> Reduced data availability: 28 pgs inactive, 28 pgs incomplete
>
> Degraded data redundancy: 55 pgs undersized
>
> 2 slow ops, oldest one blocked for 4449 sec, daemons
> [osd.25,osd.50,osd.51] have slow ops.
>
>
>
> These are PGs that are incomplete that HAVE DATA (Objects > 0) [ via ceph
> pg ls incomplete ]:
>
> 2.35 23199 0 0 0 95980273664 0
> 0 2477 incomplete 10s 2104'46277 28260:686871
> [44,4,37,3,40,32]p44 [44,4,37,3,40,32]p44
> 2023-01-03T03:54:47.821280+0000 2022-12-29T18:53:09.287203+0000
> 14 queued for deep scrub
> 2.53 22821 0 0 0 94401175552 0
> 0 2745 remapped+incomplete 10s 2104'45845 28260:565267
> [60,48,52,65,67,7]p60 [60]p60
> 2023-01-03T10:18:13.388383+0000 2023-01-03T10:18:13.388383+0000
> 408 queued for scrub
> 2.9f 22858 0 0 0 94555983872 0
> 0 2736 remapped+incomplete 10s 2104'45636 28260:759872
> [56,59,3,57,5,32]p56 [56]p56
> 2023-01-03T10:55:49.848693+0000 2023-01-03T10:55:49.848693+0000
> 376 queued for scrub
> 2.be 22870 0 0 0 94429110272 0
> 0 2661 remapped+incomplete 10s 2104'45561 28260:813759
> [41,31,37,9,7,69]p41 [41]p41
> 2023-01-03T14:02:15.790077+0000 2023-01-03T14:02:15.790077+0000
> 360 queued for scrub
> 2.e4 22953 0 0 0 94912278528 0
> 0 2648 remapped+incomplete 20m 2104'46048 28259:732896
> [37,46,33,4,48,49]p37 [37]p37
> 2023-01-02T18:38:46.268723+0000 2022-12-29T18:05:47.431468+0000
> 18 queued for deep scrub
> 17.78 20169 0 0 0 84517834400 0
> 0 2198 remapped+incomplete 10s 3735'53405 28260:1243673
> [4,37,2,36,66,0]p4 [41]p41
> 2023-01-03T14:21:41.563424+0000 2023-01-03T14:21:41.563424+0000
> 348 queued for scrub
> 17.d8 20328 0 0 0 85196053130 0
> 0 1852 remapped+incomplete 10s 3735'54458 28260:1309564
> [38,65,61,37,58,39]p38 [53]p53
> 2023-01-02T18:32:35.371071+0000 2022-12-28T19:08:29.492244+0000
> 21 queued for deep scrub
>
> At present I'm unable to reliably access my data due to the incomplete PGs
> above. I'll post whatever outputs requested (won't post now as it can be
> rather verbose). Is there hope?
Hi,
Normally I use rclone to migrate buckets across clusters.
However, this time the user has close to 1,000 buckets, so I wonder what the best approach would be rather than going bucket by bucket. Any idea?
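The fallback I have in mind is simply scripting it per bucket, something like
this (a sketch; the uid and the rclone remote names are placeholders):
for BUCKET in $(radosgw-admin bucket list --uid=<user-id> | jq -r '.[]'); do
    rclone sync old-cluster:"${BUCKET}" new-cluster:"${BUCKET}" --progress
done
I believe rclone can also sync from the remote root (rclone sync old-cluster: new-cluster:), which should cover every bucket the credentials can see, but I haven't tested that at this scale.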
Thank you
Running ceph-pacific 16.2.9 using ceph orchestrator.
We made a mistake adding a disk to the cluster and immediately issued a command to remove it using "ceph orch osd rm ### --replace --force".
This OSD had no data on it at the time and was removed after just a few minutes. "ceph orch osd rm status" shows that it is still "draining".
ceph osd df shows that the osd being removed has -1 PGs.
So - why is the simple act of removal taking so long and can we abort it and manually remove that osd somehow?
Note: the cluster is also doing a rebalance while this is going on, but the osd being removed never had any data and should not be affected by the rebalance.
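For context, this is the sequence I'd expect to use to abort it and clean up
manually (just a sketch; <id> is the OSD id, and I haven't verified it's safe
to run mid-rebalance):
# cancel the scheduled/stuck removal
ceph orch osd rm stop <id>
# remove the daemon and purge the (empty) OSD from the cluster
ceph orch daemon rm osd.<id> --force
ceph osd purge <id> --yes-i-really-mean-it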
thanks!
Hi,
I updated from Pacific 16.2.10 to Quincy 17.2.5 and the orchestrated upgrade went perfectly. Very impressive.
I have one host which then started throwing a cephadm warning after the upgrade.
2023-01-07 11:17:50,080 7f0b26c8ab80 INFO Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45 -e NODE_NAME=kelli.domain.name -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca:/var/run/ceph:z -v /var/log/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca:/var/log/ceph:z -v /var/lib/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca/selinux:/sys/fs/selinux:ro -v /:/rootfs -v /tmp/ceph-tmpltrnmxf8:/etc/ceph/ceph.conf:z quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45 inventory --format=json-pretty --filter-for-batch
2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr Traceback (most recent call last):
2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/sbin/ceph-volume", line 11, in <module>
2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr self.main(self.argv)
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr return f(*a, **kw)
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr terminal.dispatch(self.mapper, subcommand_args)
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr instance.main()
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/inventory/main.py", line 53, in main
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr with_lsm=self.args.with_lsm))
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 39, in __init__
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr all_devices_vgs = lvm.get_all_devices_vgs()
2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 797, in get_all_devices_vgs
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr return [VolumeGroup(**vg) for vg in vgs]
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 797, in <listcomp>
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr return [VolumeGroup(**vg) for vg in vgs]
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 517, in __init__
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr raise ValueError('VolumeGroup must have a non-empty name')
2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr ValueError: VolumeGroup must have a non-empty name
This host is the only one which has 14 drives that aren't being used. I'm guessing this is why it's getting this error. The drives may have been used previously in a cluster (maybe not the same one) or something. I don't know.
Any suggestions for what to try to get past this issue?
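In case it matters, this is what I was going to check next on that host (a
sketch; /dev/sdX stands for one of the 14 unused drives):
# look for PVs that report an empty VG name, which seems to be what ceph-volume trips over
pvs -o pv_name,vg_name
# clear stale LVM / filesystem signatures from a drive that is definitely unused
wipefs -a /dev/sdX
# or let the orchestrator zap it
ceph orch device zap kelli.domain.name /dev/sdX --force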
peter
Peter Eisch
DevOps Manager
peter.eisch(a)virginpulse.com
T1.612.445.5135
Hello team,
I have deployed a Ceph cluster in production. The cluster is composed of
two types of disks, HDD and SSD, and was deployed using ceph-ansible.
Unfortunately after deployment only the HDD disks appear, without the SSDs.
I would like to restart the deployment from scratch, but I'm missing the
way to erase the disks back to their initial state. I tried to format the
disks, but the LVM volumes come back:
sda
8:0 0 7.3T 0 disk
└─ceph--da4a5d58--73ef--473b--9960--371f837cb5ed-osd--block--6e800937--c4d2--4fc9--84ca--083c39d057a8
253:1 0 7.3T 0 lvm
sdb
8:16 0 7.3T 0 disk
└─ceph--773f50a1--79ed--4908--8f81--74f85efeb473-osd--block--9737a046--ba8b--4494--91f7--b80dd894df0b
253:7 0 7.3T 0 lvm
sdc
8:32 0 7.3T 0 disk
└─ceph--02000cec--fdbc--4def--967e--a7c32c851964-osd--block--c54d8182--b5e7--4c73--8d7b--7d24c7a3ce15
253:6 0 7.3T 0 lvm
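Is something like this the right way to clear them (just a sketch of what I
was about to try, using the device names from the output above)?
# per device: remove the ceph VG/LV and wipe the signatures
ceph-volume lvm zap /dev/sda --destroy
ceph-volume lvm zap /dev/sdb --destroy
ceph-volume lvm zap /dev/sdc --destroy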
Kindly help me to sort this out.
Best regards
Michel
Hi,
In a working all-in-one test setup (where making the bucket public works
from the browser):
radosgw-admin bucket list
[
"711138fc95764303b83002c567ce0972/demo"
]
I have another cluster where openstack and ceph are separate.
I have set the same config options in ceph.conf:
rgw_enable_apis = swift
rgw_keystone_accepted_roles = member, _member_, admin, swiftoperator
rgw_keystone_admin_domain = default
rgw_keystone_admin_password = ****
rgw_keystone_admin_project = service
rgw_keystone_admin_user = ****
rgw_keystone_api_version = 3
rgw_keystone_implicit_tenants = true
rgw_keystone_url = https://<keystone-url>:5000
rgw_swift_account_in_url = true
rgw_swift_versioning_enabled = true
but the output is different
radosgw-admin bucket list
[
"demo",
]
## this is created without the project-uuid.
What is happening is that when I make the bucket public, the link
https://cloud.domain.com:8080/swift/v1/AUTH_b9a4b517525a483a9e111044713bfa1…
returns NoSuchBucket.
Please let me know what setting I could be missing so that the bucket is
created with the project_id as well and the link works when the bucket is
public.
Thanks
Hello. I really screwed up my ceph cluster. Hoping to get data off it
so I can rebuild it.
In summary, too many changes too quickly caused the cluster to develop
incomplete PGs. Some PGs were reporting that OSDs were to be probed.
I've created those OSD IDs (empty), however this wouldn't clear
incompletes. Incompletes are part of EC pools. Running 17.2.5.
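For reference, this is how I've been checking what an incomplete PG is
waiting on (PG 2.35 below is just one example from the list further down):
ceph pg 2.35 query > /tmp/pg_2.35_query.json
# recovery_state in the output shows "probing_osds", "down_osds_we_would_probe" and "blocked_by"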
This is the overall state:
cluster:
id: 49057622-69fc-11ed-b46e-d5acdedaae33
health: HEALTH_WARN
Failed to apply 1 service(s): osd.dashboard-admin-1669078094056
1 hosts fail cephadm check
cephadm background work is paused
Reduced data availability: 28 pgs inactive, 28 pgs incomplete
Degraded data redundancy: 55 pgs undersized
2 slow ops, oldest one blocked for 4449 sec, daemons
[osd.25,osd.50,osd.51] have slow ops.
These are PGs that are incomplete that HAVE DATA (Objects > 0) [ via ceph
pg ls incomplete ]:
2.35 23199 0 0 0 95980273664 0
0 2477 incomplete 10s 2104'46277 28260:686871
[44,4,37,3,40,32]p44 [44,4,37,3,40,32]p44
2023-01-03T03:54:47.821280+0000 2022-12-29T18:53:09.287203+0000
14 queued for deep scrub
2.53 22821 0 0 0 94401175552 0
0 2745 remapped+incomplete 10s 2104'45845 28260:565267
[60,48,52,65,67,7]p60 [60]p60
2023-01-03T10:18:13.388383+0000 2023-01-03T10:18:13.388383+0000
408 queued for scrub
2.9f 22858 0 0 0 94555983872 0
0 2736 remapped+incomplete 10s 2104'45636 28260:759872
[56,59,3,57,5,32]p56 [56]p56
2023-01-03T10:55:49.848693+0000 2023-01-03T10:55:49.848693+0000
376 queued for scrub
2.be 22870 0 0 0 94429110272 0
0 2661 remapped+incomplete 10s 2104'45561 28260:813759
[41,31,37,9,7,69]p41 [41]p41
2023-01-03T14:02:15.790077+0000 2023-01-03T14:02:15.790077+0000
360 queued for scrub
2.e4 22953 0 0 0 94912278528 0
0 2648 remapped+incomplete 20m 2104'46048 28259:732896
[37,46,33,4,48,49]p37 [37]p37
2023-01-02T18:38:46.268723+0000 2022-12-29T18:05:47.431468+0000
18 queued for deep scrub
17.78 20169 0 0 0 84517834400 0
0 2198 remapped+incomplete 10s 3735'53405 28260:1243673
[4,37,2,36,66,0]p4 [41]p41
2023-01-03T14:21:41.563424+0000 2023-01-03T14:21:41.563424+0000
348 queued for scrub
17.d8 20328 0 0 0 85196053130 0
0 1852 remapped+incomplete 10s 3735'54458 28260:1309564
[38,65,61,37,58,39]p38 [53]p53
2023-01-02T18:32:35.371071+0000 2022-12-28T19:08:29.492244+0000
21 queued for deep scrub
At present I'm unable to reliably access my data due to the incomplete PGs
above. I'll post whatever outputs requested (won't post now as it can be
rather verbose). Is there hope?