January 2023 - ceph-users

by Boris Behrens

Hi, I am currently trying to figure out how to resolve the "large objects found in pool 'rgw.usage'" error. In the past I trimmed the usage log, but now I am at the point that I need to trim it down to two weeks. I checked and amount of omapkeys and the distribution is quite off: # for OBJECT in `rados -p rgw.usage ls`; do rados -p eu-central-1.rgw.usage listomapkeys ${OBJECT} | wc -l done 86968 144388 6188 87854 46652 194788 46234 9622 45768 28376 104348 10018 11112 34374 44744 40638 93664 35476 107794 18020 7172 17836 37344 73496 15572 31570 149352 740 113566 35292 5318 442176 Maybe it would be an option to increase this value?rgw usage max user shards -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

1 year, 3 months

1
0
0 0

Re: Serious cluster issue - Incomplete PGs

by Deep Dish

Thanks for the insight Eugen. Here's what basically happened: - Upgrade from Nautilus to Quincy via migration to new cluster on temp hardware; - Data from Nautilus migrated successfully to older / lab-type equipment running Quincy; - Nautilus Hardware rebuilt for Quincy, data migrated back; - As data was migrating we set the older notes to maintenance mode and started to drain them; - After several days many OSDs were showing as spinning in "deleting" status on portal and we were marked OUT; - This point we made the incorrect assumption those OSDs were no longer required and proceeded to remove those nodes / OSDs. I understand Incomplete pages are basically lost. And it's likely a lengthy task to attempt to salvage data. Backups will be challenging. I honestly didn't anticipate this kind of failure with ceph to be possible, we've been using it for several years now and were encouraged by orchestrator and performance improvements in the 17 code branch. The fact is of the Incomplete pages that have object counts > 0, there's about 644 GB of data that's tied up in this mess. There are other incomplete PGs with object = 0 which I understand can be manually marked as complete. The cluster has a data usage of 61 TiB. Of this I can categorize about 14TB of critical data, 40 TB of data that is of medium / high importance. There's 14TB in RBD images that would be critical on an EC pool there are other images, however of lower importance at this point; There's also about a 20TB CephFS file system of lower data importance as well. Question - Can you kindly point me to procedures for: - Identifying the pools / images / files that are affected by incomplete pages; - Extracting and reconstructing data for RBD images (these images are XFS formatted filesystems); - Extracting and reconstructing data for CephFS Files not affected by incomplete PGs. Much appreciated. ------------------------------ Date: Mon, 09 Jan 2023 10:12:49 +0000 From: Eugen Block <eblock(a)nde.ag> Subject: [ceph-users] Re: Serious cluster issue - Incomplete PGs To: ceph-users(a)ceph.io Message-ID: <20230109101249.Horde.hAHCWQijFMYLNdX8a2YQDVV(a)webmail.nde.ag> Content-Type: text/plain; charset=utf-8; format=flowed; DelSp=Yes Hi, can you clarify what exactly you did to get into this situation? What about the undersized PGs, any chance to bring those OSDs back online? Regarding the incomplete PGs I'm not sure there's much you can do if the OSDs are lost. To me it reads like you may have destroyed/recreated more OSDs than you should have, just recreating OSDs with the same IDs is not sufficient if you destroyed too many chunks. Each OSD only contains a chunk of the PG due to the erasure coding. I'm afraid those objects are lost and you would have to restore from backup. To get the cluster into a healthy state again there a couple of threads, e. g. [1], but recovering the lost chunks from ceph will probably not work. Regards, Eugen [1] https://www.mail-archive.com/ceph-users@ceph.io/msg14757.html Zitat von Deep Dish <deeepdish(a)gmail.com>: > Hello. I really screwed up my ceph cluster. Hoping to get data off it > so I can rebuild it. > > In summary, too many changes too quickly caused the cluster to develop > incomplete pgs. Some PGS were reporting that OSDs were to be probes. > I've created those OSD IDs (empty), however this wouldn't clear > incompletes. Incompletes are part of EC pools. Running 17.2.5. > > This is the overall state: > > cluster: > > id: 49057622-69fc-11ed-b46e-d5acdedaae33 > > health: HEALTH_WARN > > Failed to apply 1 service(s): osd.dashboard-admin-1669078094056 > > 1 hosts fail cephadm check > > cephadm background work is paused > > Reduced data availability: 28 pgs inactive, 28 pgs incomplete > > Degraded data redundancy: 55 pgs undersized > > 2 slow ops, oldest one blocked for 4449 sec, daemons > [osd.25,osd.50,osd.51] have slow ops. > > > > These are PGs that are incomplete that HAVE DATA (Objects > 0) [ via ceph > pg ls incomplete ]: > > 2.35 23199 0 0 0 95980273664 0 > 0 2477 incomplete 10s 2104'46277 28260:686871 > [44,4,37,3,40,32]p44 [44,4,37,3,40,32]p44 > 2023-01-03T03:54:47.821280+0000 2022-12-29T18:53:09.287203+0000 > 14 queued for deep scrub > 2.53 22821 0 0 0 94401175552 0 > 0 2745 remapped+incomplete 10s 2104'45845 28260:565267 > [60,48,52,65,67,7]p60 [60]p60 > 2023-01-03T10:18:13.388383+0000 2023-01-03T10:18:13.388383+0000 > 408 queued for scrub > 2.9f 22858 0 0 0 94555983872 0 > 0 2736 remapped+incomplete 10s 2104'45636 28260:759872 > [56,59,3,57,5,32]p56 [56]p56 > 2023-01-03T10:55:49.848693+0000 2023-01-03T10:55:49.848693+0000 > 376 queued for scrub > 2.be 22870 0 0 0 94429110272 0 > 0 2661 remapped+incomplete 10s 2104'45561 28260:813759 > [41,31,37,9,7,69]p41 [41]p41 > 2023-01-03T14:02:15.790077+0000 2023-01-03T14:02:15.790077+0000 > 360 queued for scrub > 2.e4 22953 0 0 0 94912278528 0 > 0 2648 remapped+incomplete 20m 2104'46048 28259:732896 > [37,46,33,4,48,49]p37 [37]p37 > 2023-01-02T18:38:46.268723+0000 2022-12-29T18:05:47.431468+0000 > 18 queued for deep scrub > 17.78 20169 0 0 0 84517834400 0 > 0 2198 remapped+incomplete 10s 3735'53405 28260:1243673 > [4,37,2,36,66,0]p4 [41]p41 > 2023-01-03T14:21:41.563424+0000 2023-01-03T14:21:41.563424+0000 > 348 queued for scrub > 17.d8 20328 0 0 0 85196053130 0 > 0 1852 remapped+incomplete 10s 3735'54458 28260:1309564 > [38,65,61,37,58,39]p38 [53]p53 > 2023-01-02T18:32:35.371071+0000 2022-12-28T19:08:29.492244+0000 > 21 queued for deep scrub > > At present I'm unable to reliably access my data due to incomplete pages > above. I'll post whatever outputs requested (won't post now as it can be > rather verbose). Is there hope?

1 year, 3 months

2
1
0 0

Re: 1 pg recovery_unfound after multiple crash of an OSD

by Kai Stian Olstad

Hi Just a follow up, the issue was solved by running command ceph pg 404.1ff mark_unfound_lost delete - Kai Stian Olstad On 04.01.2023 13:00, Kai Stian Olstad wrote: > Hi > > We are running Ceph 16.2.6 deployed with Cephadm. > > Around Christmas OSD 245 and 327 had about 20 read error so I set them > to out. > > Around new year another OSD 313 more or less died since is become so > slow that it triggered Linux default I/O-timeout of 30 seconds. > In this period the OSD crashed 8 times and was restartet by Systemd > and we ended up with > > [WRN] OBJECT_UNFOUND: 1/416287126 objects unfound (0.000%) > pg 404.1ff has 1 unfound objects > [ERR] PG_DAMAGED: Possible data damage: 1 pg recovery_unfound > pg 404.1ff is active+recovery_unfound+degraded+remapped, acting > [208,220,269,175,313,329], 1 unfound > [WRN] PG_DEGRADED: Degraded data redundancy: 5/2364745884 objects > degraded (0.000%), 1 pg degraded > pg 404.1ff is active+recovery_unfound+degraded+remapped, acting > [208,220,269,175,313,329], 1 unfound > > The pool 404 is "default.rgw.buckets.data" and pool 404 is erasure > encoding 4+2. > > I have search for a solution but with no luck, what I have tried is > > - Restarted all 6 OSD for the PG one by one > - Running repair of 404.1ff > > Output of following command > - ceph -s > - ceph health detail > - ceph pg ls | grep -e PG -e ^404.1ff > - ceph osd pool ls detail | grep 404 > - ceph osd tree out > - ceph crash ls | grep -e ID -e osd.313 > - ceph pg 404.1ff list_unfound > - ceph pg 404.1ff > > Is appended below, can also be read here > https://gitlab.com/-/snippets/2479624 > or cloned with "git clone https://gitlab.com/-/snippets/2479624" > > Does anyone have any idea on how to resolv the problem? > Any help is much appreciated. > > - > Kai Stian Olstad > > > > :::::::::::::: > ceph-s.txt > :::::::::::::: > ceph -s > ------- > cluster: > id: d13c6b81-51ee-4d22-84e9-456f9307296c > health: HEALTH_ERR > 1/416287125 objects unfound (0.000%) > Possible data damage: 1 pg recovery_unfound > Degraded data redundancy: 5/2364745860 objects degraded > (0.000%), 1 pg degraded > > services: > mon: 3 daemons, quorum ceph-mon-1,ceph-mon-2,ceph-mon-3 (age 2M) > mgr: ceph-mon-2.mfdanx(active, since 3w), standbys: > ceph-mon-1.ptrsea > mds: 1/1 daemons up, 1 standby > osd: 355 osds: 355 up (since 20h), 352 in (since 2d); 1 remapped > pgs > rgw: 4 daemons active (4 hosts, 1 zones) > > data: > volumes: 1/1 healthy > pools: 14 pools, 2505 pgs > objects: 416.29M objects, 540 TiB > usage: 939 TiB used, 2.1 PiB / 3.0 PiB avail > pgs: 5/2364745860 objects degraded (0.000%) > 137931/2364745860 objects misplaced (0.006%) > 1/416287125 objects unfound (0.000%) > 2489 active+clean > 14 active+clean+scrubbing+deep > 1 active+recovery_unfound+degraded+remapped > 1 active+clean+scrubbing > > io: > client: 38 MiB/s rd, 23 MiB/s wr, 2.58k op/s rd, 326 op/s wr > > progress: > Global Recovery Event (6d) > [===========================.] (remaining: 3m) > > > :::::::::::::: > ceph_health_detail.txt > :::::::::::::: > ceph health detail > ------------------ > HEALTH_ERR 1/416287126 objects unfound (0.000%); Possible data damage: > 1 pg recovery_unfound; Degraded data redundancy: 5/2364745884 objects > degraded (0.000%), 1 pg degraded > [WRN] OBJECT_UNFOUND: 1/416287126 objects unfound (0.000%) > pg 404.1ff has 1 unfound objects > [ERR] PG_DAMAGED: Possible data damage: 1 pg recovery_unfound > pg 404.1ff is active+recovery_unfound+degraded+remapped, acting > [208,220,269,175,313,329], 1 unfound > [WRN] PG_DEGRADED: Degraded data redundancy: 5/2364745884 objects > degraded (0.000%), 1 pg degraded > pg 404.1ff is active+recovery_unfound+degraded+remapped, acting > [208,220,269,175,313,329], 1 unfound > > > :::::::::::::: > ceph_pg_ls.txt > :::::::::::::: > ceph pg ls | grep -e PG -e ^404.1ff > ----------------------------------- > PG OBJECTS DEGRADED MISPLACED UNFOUND BYTES > OMAP_BYTES* OMAP_KEYS* LOG STATE > SINCE VERSION REPORTED UP > ACTING SCRUB_STAMP > DEEP_SCRUB_STAMP > 404.1ff 137912 5 137908 1 282417561722 > 0 0 5528 active+recovery_unfound+degraded+remapped > 19h 141748'724163 141748:3558203 [208,220,269,175,343,329]p208 > [208,220,269,175,313,329]p208 2022-12-31T19:27:10.993286+0000 > 2022-12-31T19:27:10.993286+0000 > > > :::::::::::::: > ceph_osd_pool_ls_detail.txt > :::::::::::::: > ceph osd pool ls detail | grep 404 > ---------------------------------- > pool 404 'default.rgw.buckets.data' erasure profile > ec42-jerasure-blaum_roth-hdd size 6 min_size 5 crush_rule 2 > object_hash rjenkins pg_num 2048 pgp_num 2048 autoscale_mode on > last_change 124077 lfor 0/52091/108555 flags hashpspool stripe_width > 229376 target_size_bytes 1099511627776000 application rgw > > > :::::::::::::: > ceph_osd_tree_out.txt > :::::::::::::: > ceph osd tree out > ----------------- > ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF > -1 3112.43481 root default > -58 192.35847 host ceph-hd-010 > 245 hdd 12.82390 osd.245 up 0 1.00000 > -79 192.35847 host ceph-hd-014 > 313 hdd 12.82390 osd.313 up 0 1.00000 > 327 hdd 12.82390 osd.327 up 0 1.00000 > > > :::::::::::::: > ceph_crash_ls.txt > :::::::::::::: > ceph crash ls | grep -e ID -e osd.313 > ------------------------------------- > ID > ENTITY NEW > 2022-12-31T01:19:06.859805Z_8e94201c-8817-4c4e-b4dd-67c5b1d3a73a > osd.313 > 2022-12-31T01:19:47.265925Z_551b2de6-d1b0-4cbc-9e62-27ede7a46803 > osd.313 > 2022-12-31T01:20:18.648279Z_64f232ec-2cb5-4dcf-8686-b12987ed7e45 > osd.313 > 2022-12-31T20:28:43.093203Z_9c9a590f-7490-4f74-affe-5bda27b87618 > osd.313 > 2023-01-02T01:34:25.087309Z_b7a4b341-5f79-467b-a4e2-3df5456a2ff7 > osd.313 > 2023-01-02T03:46:32.295053Z_d71b5b62-5974-4b3e-b1c5-cc878d376f06 > osd.313 > 2023-01-02T10:58:58.350473Z_ca87b412-bae5-4873-9997-092587d7c699 > osd.313 > 2023-01-02T11:42:13.433011Z_c0e9ab85-a162-4331-b690-68847c641bd0 > osd.313 > > > :::::::::::::: > ceph_pg_404.1ff_list_unfound.json > :::::::::::::: > { > "num_missing": 1, > "num_unfound": 1, > "objects": [ > { > "oid": { > "oid": > "1f244892-a2e7-406b-aa62-1b13511333a2.902205.1__multipart_ingest/date=2022-12-30/2022-12-30.json.gz.2~pORMz8G-kyQx8weY8GhNt-pgAmQDufG.684", > "key": "", > "snapid": -2, > "hash": 1069593087, > "max": 0, > "pool": 404, > "namespace": "" > }, > "need": "140483'722076", > "have": "0'0", > "flags": "none", > "clean_regions": "clean_offsets: [], clean_omap: 0, > new_object: 1", > "locations": [ > "329(5)" > ] > } > ], > "state": "NotRecovering", > "available_might_have_unfound": true, > "might_have_unfound": [], > "more": false > } > > > :::::::::::::: > ceph_pg_404.1ff.json > :::::::::::::: > { > "snap_trimq": "[]", > "snap_trimq_len": 0, > "state": "active+recovery_unfound+degraded+remapped", > "epoch": 141748, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "backfill_targets": [ > "343(4)" > ], > "acting_recovery_backfill": [ > "175(3)", > "208(0)", > "220(1)", > "269(2)", > "313(4)", > "329(5)", > "343(4)" > ], > "info": { > "pgid": "404.1ffs0", > "last_update": "141747'724161", > "last_complete": "140483'722075", > "log_tail": "138455'718635", > "last_user_version": 724161, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141747'724161", > "reported_seq": 3558199, > "reported_epoch": 141748, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-04T07:48:25.201886+0000", > "last_change": "2023-01-03T12:15:50.155062+0000", > "last_active": "2023-01-04T07:48:25.201886+0000", > "last_peered": "2023-01-04T07:48:25.201886+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": "2023-01-03T12:15:49.875672+0000", > "last_became_peered": "2023-01-03T12:15:49.875672+0000", > "last_unstale": "2023-01-04T07:48:25.201886+0000", > "last_undegraded": "2023-01-03T12:15:49.758975+0000", > "last_fullsized": "2023-01-04T07:48:25.201886+0000", > "mapping_epoch": 141659, > "log_start": "138455'718635", > "ondisk_log_start": "138455'718635", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5526, > "ondisk_log_size": 5526, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 282409304186, > "num_objects": 137910, > "num_object_clones": 0, > "num_object_copies": 827460, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 1, > "num_objects_degraded": 5, > "num_objects_misplaced": 137906, > "num_objects_unfound": 1, > "num_objects_dirty": 137910, > "num_whiteouts": 0, > "num_read": 732600, > "num_read_kb": 181255155, > "num_write": 471445, > "num_write_kb": 102310260, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137909 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > "peer_info": [ > { > "peer": "175(3)", > "pgid": "404.1ffs3", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138378'718235", > "last_user_version": 723769, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141624'723769", > "reported_seq": 3550912, > "reported_epoch": 141624, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-03T11:57:15.480994+0000", > "last_change": "2023-01-03T11:31:07.335605+0000", > "last_active": "2023-01-03T11:57:15.480994+0000", > "last_peered": "2023-01-03T11:57:15.480994+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": > "2023-01-03T11:31:07.116154+0000", > "last_became_peered": > "2023-01-03T11:31:07.116154+0000", > "last_unstale": "2023-01-03T11:57:15.480994+0000", > "last_undegraded": "2023-01-03T11:31:07.099558+0000", > "last_fullsized": "2023-01-03T11:57:15.480994+0000", > "mapping_epoch": 141659, > "log_start": "138378'718235", > "ondisk_log_start": "138378'718235", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5534, > "ondisk_log_size": 5534, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 281850993538, > "num_objects": 137711, > "num_object_clones": 0, > "num_object_copies": 826266, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 1, > "num_objects_degraded": 5, > "num_objects_misplaced": 137707, > "num_objects_unfound": 1, > "num_objects_dirty": 137711, > "num_whiteouts": 0, > "num_read": 720741, > "num_read_kb": 180044133, > "num_write": 469883, > "num_write_kb": 101652239, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137710 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > { > "peer": "220(1)", > "pgid": "404.1ffs1", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138378'718235", > "last_user_version": 723769, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141624'723769", > "reported_seq": 3550912, > "reported_epoch": 141624, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-03T11:57:15.480994+0000", > "last_change": "2023-01-03T11:31:07.335605+0000", > "last_active": "2023-01-03T11:57:15.480994+0000", > "last_peered": "2023-01-03T11:57:15.480994+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": > "2023-01-03T11:31:07.116154+0000", > "last_became_peered": > "2023-01-03T11:31:07.116154+0000", > "last_unstale": "2023-01-03T11:57:15.480994+0000", > "last_undegraded": "2023-01-03T11:31:07.099558+0000", > "last_fullsized": "2023-01-03T11:57:15.480994+0000", > "mapping_epoch": 141659, > "log_start": "138378'718235", > "ondisk_log_start": "138378'718235", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5534, > "ondisk_log_size": 5534, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 281850993538, > "num_objects": 137711, > "num_object_clones": 0, > "num_object_copies": 826266, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 1, > "num_objects_degraded": 5, > "num_objects_misplaced": 137707, > "num_objects_unfound": 1, > "num_objects_dirty": 137711, > "num_whiteouts": 0, > "num_read": 720741, > "num_read_kb": 180044133, > "num_write": 469883, > "num_write_kb": 101652239, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137710 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > { > "peer": "269(2)", > "pgid": "404.1ffs2", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138378'718235", > "last_user_version": 723769, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141624'723769", > "reported_seq": 3550912, > "reported_epoch": 141624, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-03T11:57:15.480994+0000", > "last_change": "2023-01-03T11:31:07.335605+0000", > "last_active": "2023-01-03T11:57:15.480994+0000", > "last_peered": "2023-01-03T11:57:15.480994+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": > "2023-01-03T11:31:07.116154+0000", > "last_became_peered": > "2023-01-03T11:31:07.116154+0000", > "last_unstale": "2023-01-03T11:57:15.480994+0000", > "last_undegraded": "2023-01-03T11:31:07.099558+0000", > "last_fullsized": "2023-01-03T11:57:15.480994+0000", > "mapping_epoch": 141659, > "log_start": "138378'718235", > "ondisk_log_start": "138378'718235", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5534, > "ondisk_log_size": 5534, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 281850993538, > "num_objects": 137711, > "num_object_clones": 0, > "num_object_copies": 826266, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 1, > "num_objects_degraded": 5, > "num_objects_misplaced": 137707, > "num_objects_unfound": 1, > "num_objects_dirty": 137711, > "num_whiteouts": 0, > "num_read": 720741, > "num_read_kb": 180044133, > "num_write": 469883, > "num_write_kb": 101652239, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137710 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > { > "peer": "313(4)", > "pgid": "404.1ffs4", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138228'717435", > "last_user_version": 723769, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141624'723769", > "reported_seq": 3551314, > "reported_epoch": 141659, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-03T11:57:15.480994+0000", > "last_change": "2023-01-03T11:31:07.335605+0000", > "last_active": "2023-01-03T11:57:15.480994+0000", > "last_peered": "2023-01-03T11:57:15.480994+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": > "2023-01-03T11:31:07.116154+0000", > "last_became_peered": > "2023-01-03T11:31:07.116154+0000", > "last_unstale": "2023-01-03T11:57:15.480994+0000", > "last_undegraded": "2023-01-03T11:31:07.099558+0000", > "last_fullsized": "2023-01-03T11:57:15.480994+0000", > "mapping_epoch": 141659, > "log_start": "138378'718235", > "ondisk_log_start": "138378'718235", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5534, > "ondisk_log_size": 5534, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 281850993538, > "num_objects": 137711, > "num_object_clones": 0, > "num_object_copies": 826266, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 1, > "num_objects_degraded": 5, > "num_objects_misplaced": 137707, > "num_objects_unfound": 1, > "num_objects_dirty": 137711, > "num_whiteouts": 0, > "num_read": 720741, > "num_read_kb": 180044133, > "num_write": 469883, > "num_write_kb": 101652239, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137710 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > { > "peer": "329(5)", > "pgid": "404.1ffs5", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138378'718235", > "last_user_version": 723769, > "last_backfill": "MAX", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "141624'723769", > "reported_seq": 3550912, > "reported_epoch": 141624, > "state": "active+recovery_unfound+degraded+remapped", > "last_fresh": "2023-01-03T11:57:15.480994+0000", > "last_change": "2023-01-03T11:31:07.335605+0000", > "last_active": "2023-01-03T11:57:15.480994+0000", > "last_peered": "2023-01-03T11:57:15.480994+0000", > "last_clean": "2022-12-31T19:25:42.758665+0000", > "last_became_active": > "2023-01-03T11:31:07.116154+0000", > "last_became_peered": > "2023-01-03T11:31:07.116154+0000", > "last_unstale": "2023-01-03T11:57:15.480994+0000", > "last_undegraded": "2023-01-03T11:31:07.099558+0000", > "last_fullsized": "2023-01-03T11:57:15.480994+0000", > "mapping_epoch": 141659, > "log_start": "138378'718235", > "ondisk_log_start": "138378'718235", > "created": 36041, > "last_epoch_clean": 140452, > "parent": "0.0", > "parent_split_bits": 10, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "log_size": 5534, > "ondisk_log_size": 5534, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 281850993538, > "num_objects": 137711, > "num_object_clones": 0, > "num_object_copies": 826266, > "num_objects_missing_on_primary": 1, > "num_objects_missing": 0, > "num_objects_degraded": 5, > "num_objects_misplaced": 137707, > "num_objects_unfound": 1, > "num_objects_dirty": 137711, > "num_whiteouts": 0, > "num_read": 720741, > "num_read_kb": 180044133, > "num_write": 469883, > "num_write_kb": 101652239, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 406583, > "num_bytes_recovered": 891273764570, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 3 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [ > "329(5)" > ], > "object_location_counts": [ > { > "shards": > "175(3),208(0),220(1),269(2),313(4),329(5)", > "objects": 137710 > }, > { > "shards": "329(5)", > "objects": 1 > } > ], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 0, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > }, > { > "peer": "343(4)", > "pgid": "404.1ffs4", > "last_update": "141747'724161", > "last_complete": "141668'723872", > "log_tail": "138378'718235", > "last_user_version": 723769, > "last_backfill": > "404:ff80001b:::1f244892-a2e7-406b-aa62-1b13511333a2.625411.3__multipart_2021-11-09T12%3a41%3a05,478440857+00%3a00.2~XwOh1_IhVzghlM-zi7DfwoUIZl5UxVt.1:head", > "purged_snaps": [], > "history": { > "epoch_created": 36041, > "epoch_pool_created": 27862, > "last_epoch_started": 141660, > "last_interval_started": 141659, > "last_epoch_clean": 140452, > "last_interval_clean": 140451, > "last_epoch_split": 108548, > "last_epoch_marked_full": 0, > "same_up_since": 141659, > "same_interval_since": 141659, > "same_primary_since": 141615, > "last_scrub": "140525'722168", > "last_scrub_stamp": "2022-12-31T19:27:10.993286+0000", > "last_deep_scrub": "140525'722168", > "last_deep_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "last_clean_scrub_stamp": > "2022-12-31T19:27:10.993286+0000", > "prior_readable_until_ub": 0 > }, > "stats": { > "version": "0'0", > "reported_seq": 0, > "reported_epoch": 0, > "state": "unknown", > "last_fresh": "0.000000", > "last_change": "0.000000", > "last_active": "0.000000", > "last_peered": "0.000000", > "last_clean": "0.000000", > "last_became_active": "0.000000", > "last_became_peered": "0.000000", > "last_unstale": "0.000000", > "last_undegraded": "0.000000", > "last_fullsized": "0.000000", > "mapping_epoch": 141659, > "log_start": "0'0", > "ondisk_log_start": "0'0", > "created": 0, > "last_epoch_clean": 0, > "parent": "0.0", > "parent_split_bits": 0, > "last_scrub": "0'0", > "last_scrub_stamp": "0.000000", > "last_deep_scrub": "0'0", > "last_deep_scrub_stamp": "0.000000", > "last_clean_scrub_stamp": "0.000000", > "log_size": 0, > "ondisk_log_size": 0, > "stats_invalid": false, > "dirty_stats_invalid": false, > "omap_stats_invalid": false, > "hitset_stats_invalid": false, > "hitset_bytes_stats_invalid": false, > "pin_stats_invalid": false, > "manifest_stats_invalid": false, > "snaptrimq_len": 0, > "stat_sum": { > "num_bytes": 9404416, > "num_objects": 3, > "num_object_clones": 0, > "num_object_copies": 0, > "num_objects_missing_on_primary": 0, > "num_objects_missing": 137907, > "num_objects_degraded": 0, > "num_objects_misplaced": 0, > "num_objects_unfound": 0, > "num_objects_dirty": 3, > "num_whiteouts": 0, > "num_read": 0, > "num_read_kb": 0, > "num_write": 0, > "num_write_kb": 0, > "num_scrub_errors": 0, > "num_shallow_scrub_errors": 0, > "num_deep_scrub_errors": 0, > "num_objects_recovered": 0, > "num_bytes_recovered": 0, > "num_keys_recovered": 0, > "num_objects_omap": 0, > "num_objects_hit_set_archive": 0, > "num_bytes_hit_set_archive": 0, > "num_flush": 0, > "num_flush_kb": 0, > "num_evict": 0, > "num_evict_kb": 0, > "num_promote": 0, > "num_flush_mode_high": 0, > "num_flush_mode_low": 0, > "num_evict_mode_some": 0, > "num_evict_mode_full": 0, > "num_objects_pinned": 0, > "num_legacy_snapsets": 0, > "num_large_omap_objects": 0, > "num_objects_manifest": 0, > "num_omap_bytes": 0, > "num_omap_keys": 0, > "num_objects_repaired": 0 > }, > "up": [ > 208, > 220, > 269, > 175, > 343, > 329 > ], > "acting": [ > 208, > 220, > 269, > 175, > 313, > 329 > ], > "avail_no_missing": [], > "object_location_counts": [], > "blocked_by": [], > "up_primary": 208, > "acting_primary": 208, > "purged_snaps": [] > }, > "empty": 0, > "dne": 0, > "incomplete": 1, > "last_epoch_started": 141660, > "hit_set_history": { > "current_last_update": "0'0", > "history": [] > } > } > ], > "recovery_state": [ > { > "name": "Started/Primary/Active", > "enter_time": "2023-01-03T12:15:49.758685+0000", > "might_have_unfound": [ > { > "osd": "175(3)", > "status": "already probed" > }, > { > "osd": "220(1)", > "status": "already probed" > }, > { > "osd": "269(2)", > "status": "already probed" > }, > { > "osd": "313(4)", > "status": "already probed" > }, > { > "osd": "329(5)", > "status": "already probed" > }, > { > "osd": "343(4)", > "status": "already probed" > } > ], > "recovery_progress": { > "backfill_targets": [ > "343(4)" > ], > "waiting_on_backfill": [], > "last_backfill_started": > "404:ff80001b:::1f244892-a2e7-406b-aa62-1b13511333a2.625411.3__multipart_2021-11-09T12%3a41%3a05,478440857+00%3a00.2~XwOh1_IhVzghlM-zi7DfwoUIZl5UxVt.1:head", > "backfill_info": { > "begin": "MIN", > "end": "MIN", > "objects": [] > }, > "peer_backfill_info": [], > "backfills_in_flight": [], > "recovering": [], > "pg_backend": { > "recovery_ops": [], > "read_ops": [] > } > } > }, > { > "name": "Started", > "enter_time": "2023-01-03T12:15:48.766366+0000" > } > ], > "scrubber": { > "epoch_start": "0", > "active": false > }, > "agent_state": {} > } > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io

1 year, 3 months

1
0
0 0

User migration between clusters

by Szabo, Istvan (Agoda)

Hi, Normally I use rclone to migrate buckets across clusters. However this time the user has close to 1000 buckets so I wonder what would be the best approach to do this rather buckets by buckets, any idea? Thank you ________________________________ This message is confidential and is for the sole use of the intended recipient(s). It may also be privileged or otherwise protected by copyright or other legal rules. If you have received it by mistake please let us know by reply email and delete it from your system. It is prohibited to copy this message or disclose its content to anyone. Any confidentiality or privilege is not waived or lost by any mistaken delivery or unauthorized disclosure of the message. All messages sent to and from Agoda may be monitored to ensure compliance with company policies, to protect the company's interests and to remove potential malware. Electronic messages may be intercepted, amended, lost or deleted, or contain viruses.

1 year, 3 months

1
0
0 0

ceph orch osd rm - draining forever, shows -1 pgs

by Wyll Ingersoll

Running ceph-pacific 16.2.9 using ceph orchestrator. We made a mistake adding a disk to the cluster and immediately issued a command to remove it using "ceph orch osd rm ### --replace --force". This OSD no data on it at the time and was removed after just a few minutes. "ceph orch osd rm status" shows that it is still "draining". ceph osd df shows that the osd being removed has -1 PGs. So - why is the simple act of removal taking so long and can we abort it and manually remove that osd somehow? Note: the cluster is also doing a rebalance while this is going on, but the osd being removed never had any data and should not be affected by the rebalance. thanks!

1 year, 3 months

1
0
0 0

VolumeGroup must have a non-empty name / 17.2.5

by Peter Eisch

Hi, I updated from pacific 16.2.10 to 17.2.5 and the orchestration update went perfectly. Very impressive. I have one host which then started throwing a cephadm warning after the upgrade. 2023-01-07 11:17:50,080 7f0b26c8ab80 INFO Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45 -e NODE_NAME=kelli.domain.name -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca:/var/run/ceph:z -v /var/log/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca:/var/log/ceph:z -v /var/lib/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /var/lib/ceph/404b94ab-b4d6-4218-9a4e-ecb8899108ca/selinux:/sys/fs/selinux:ro -v /:/rootfs -v /tmp/ceph-tmpltrnmxf8:/etc/ceph/ceph.conf:z quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45 inventory --format=json-pretty --filter-for-batch 2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr Traceback (most recent call last): 2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/sbin/ceph-volume", line 11, in <module> 2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')() 2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__ 2023-01-07 11:17:50,081 7f0b26c8ab80 INFO /usr/bin/podman: stderr self.main(self.argv) 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr return f(*a, **kw) 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr terminal.dispatch(self.mapper, subcommand_args) 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr instance.main() 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/inventory/main.py", line 53, in main 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr with_lsm=self.args.with_lsm)) 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/util/device.py", line 39, in __init__ 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr all_devices_vgs = lvm.get_all_devices_vgs() 2023-01-07 11:17:50,082 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 797, in get_all_devices_vgs 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr return [VolumeGroup(**vg) for vg in vgs] 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 797, in <listcomp> 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr return [VolumeGroup(**vg) for vg in vgs] 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr File "/usr/lib/python3.6/site-packages/ceph_volume/api/lvm.py", line 517, in __init__ 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr raise ValueError('VolumeGroup must have a non-empty name') 2023-01-07 11:17:50,083 7f0b26c8ab80 INFO /usr/bin/podman: stderr ValueError: VolumeGroup must have a non-empty name This host is the only one which has 14 drives which aren't being used. I'm guessing this is why its getting this error. The drives may have been used previous in a cluster (maybe not the same cluster) or something. I don't know. Any suggestions for what to try to get past this issue? peter Peter Eisch DevOps Manager peter.eisch(a)virginpulse.com T1.612.445.5135 Confidentiality Notice: This email was sent securely using Transport Layer Security (TLS) Encryption. Please ensure your email systems support TLS before replying with any confidential information. The information contained in this e-mail, including any attachment(s), is intended solely for use by the designated recipient(s). Unauthorized use, dissemination, distribution, or reproduction of this message by anyone other than the intended recipient(s), or a person designated as responsible for delivering such messages to the intended recipient, is strictly prohibited and may be unlawful. This e-mail may contain proprietary, confidential or privileged information. Any views or opinions expressed are solely those of the author and do not necessarily represent those of Virgin Pulse, Inc. If you have received this message in error, or are not the named recipient(s), please immediately notify the sender and delete this e-mail message. v3.02

1 year, 3 months

2
1
0 0

Mixing SSD and HDD disks for data in ceph cluster deployment

by Michel Niyoyita

Hello team I have an issue on ceph-deployment using ceph-ansible . we have two categories of disk , HDD and SSD , while deploying ceph only HDD are appearing no SSD appearing . the cluster is running on ubuntu OS 20.04 , unfortunately no errors appearing , did I miss something in configuration? hdd: 7,2936 T ssd: 7 T Kindly advise and help below is how the cluster is behavior, actually we have 20 disks per each host (16 hdd and 4 ssd), from /dev/sda to /dev/sdt but if you look after deployment we have 48 osds instead of 60 osds which are missing are ssd according to ceph osd crush class ls command . root@ceph-mon1:~# ceph osd crush class ls [ "hdd" ] root@ceph-mon1:~# ceph -s cluster: id: 02786875-6dca-46e6-8590-dba92c27e6f8 health: HEALTH_OK services: mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 36m) mgr: ceph-mon1(active, since 5m), standbys: ceph-mon3, ceph-mon2 osd: 48 osds: 48 up (since 30m), 48 in (since 31m) rgw: 3 daemons active (3 hosts, 1 zones) data: pools: 5 pools, 105 pgs objects: 195 objects, 7.7 KiB usage: 21 TiB used, 349 TiB / 370 TiB avail pgs: 105 active+clean and below is the output of lsblk command which shows that there is LVM created on ssd but its size doesn't appear . from /dev/sde to /dev/sdh are ssd . sda 8:0 0 7.3T 0 disk └─ceph--d856437a--0af5--48e1--a99d--b9f8f5b74165-osd--block--2856935f--a293--47b2--a917--7671e91e207d 253:0 0 7.3T 0 lvm sdb 8:16 0 7.3T 0 disk └─ceph--60119ee2--6537--4129--88b2--82bc0994766f-osd--block--c87a88ca--2c34--4a71--bffa--55789475e950 253:2 0 7.3T 0 lvm sdc 8:32 0 7.3T 0 disk └─ceph--1667e64f--8327--4a13--886e--23bbd8d4fb73-osd--block--1b9da829--4210--4956--b3c3--32042c719879 253:5 0 7.3T 0 lvm sdd 8:48 0 7.3T 0 disk └─ceph--4b9a897f--253e--4078--85fc--0133baad23c6-osd--block--fc9d3bb0--0d63--4201--a49a--6204a928bbd4 253:7 0 7.3T 0 lvm sde 8:64 0 7T 0 disk ├─ceph--54bc00ad--d575--4269--b637--a15e4aed88d2-osd--db--21ccd262--0488--49b4--9994--9cf8a61b7bbd 253:26 0 447.1G 0 lvm ├─ceph--54bc00ad--d575--4269--b637--a15e4aed88d2-osd--db--404cefe2--c24c--4955--89c9--28e8d7c44db9 253:28 0 447.1G 0 lvm ├─ceph--54bc00ad--d575--4269--b637--a15e4aed88d2-osd--db--6bea87e8--d883--46a8--b9de--e4b7b311ce20 253:30 0 447.1G 0 lvm └─ceph--54bc00ad--d575--4269--b637--a15e4aed88d2-osd--db--6767b370--a63f--402e--864c--0085bc589baa 253:32 0 447.1G 0 lvm sdf 8:80 0 7T 0 disk ├─ceph--7461bded--a2c2--4874--a08f--fca940f3a511-osd--db--b62845ed--ce46--4afd--8c48--79afd89409ca 253:18 0 447.1G 0 lvm ├─ceph--7461bded--a2c2--4874--a08f--fca940f3a511-osd--db--a40a1938--ea8b--4c9b--bf10--01ca097719ca 253:20 0 447.1G 0 lvm ├─ceph--7461bded--a2c2--4874--a08f--fca940f3a511-osd--db--abfa052f--07e4--4504--9cf8--eee3b527489c 253:22 0 447.1G 0 lvm └─ceph--7461bded--a2c2--4874--a08f--fca940f3a511-osd--db--5eef2536--829f--4893--9b94--51ae55bfd742 253:24 0 447.1G 0 lvm sdg 8:96 0 7T 0 disk ├─ceph--fb912c50--2c94--403b--b690--73ef6f4eda69-osd--db--bd3fff8a--40d2--4ed9--b15e--52812da3a15d 253:10 0 447.1G 0 lvm ├─ceph--fb912c50--2c94--403b--b690--73ef6f4eda69-osd--db--1bd422bf--5d7e--468f--a7ca--3c70d38dcd5b 253:12 0 447.1G 0 lvm ├─ceph--fb912c50--2c94--403b--b690--73ef6f4eda69-osd--db--cecccf90--6b03--47a1--9ce1--8b6d403816e1 253:14 0 447.1G 0 lvm └─ceph--fb912c50--2c94--403b--b690--73ef6f4eda69-osd--db--b8181faf--595b--4118--a35a--e51a527d6e57 253:16 0 447.1G 0 lvm sdh 8:112 0 7T 0 disk ├─ceph--bc347c07--8636--462e--82fd--d81ed149ed2c-osd--db--9d14b346--ff65--4045--95db--4c5cb43e7829 253:1 0 447.1G 0 lvm ├─ceph--bc347c07--8636--462e--82fd--d81ed149ed2c-osd--db--ec81234e--f12f--414d--8210--bf9f6ccd5d7e 253:3 0 447.1G 0 lvm ├─ceph--bc347c07--8636--462e--82fd--d81ed149ed2c-osd--db--c8135b3f--42c5--44f9--885c--6413a16b86b4 253:6 0 447.1G 0 lvm └─ceph--bc347c07--8636--462e--82fd--d81ed149ed2c-osd--db--14ba1fee--a896--4cdf--8ded--ba2e08a4c0d5 253:8 0 447.1G 0 lvm sdi 8:128 0 7.3T 0 disk └─ceph--5737bd78--e275--44f0--af42--20c70f6ecce0-osd--block--3aa9e13f--fee6--4e6f--8696--4dfa855def35 253:9 0 7.3T 0 lvm sdj 8:144 0 7.3T 0 disk └─ceph--5420bf20--6fa4--4d70--8020--3df67e6a664a-osd--block--74157cb4--74f9--43dc--af00--9f180cda1dc9 253:11 0 7.3T 0 lvm sdk 8:160 0 7.3T 0 disk └─ceph--726e1a55--a71b--4fe7--baa2--c48e80a7cabd-osd--block--461ca84d--1bdc--4df8--8235--7b8d7cfd9c86 253:13 0 7.3T 0 lvm sdl 8:176 0 7.3T 0 disk └─ceph--f4dfe3a2--6494--4ccf--9062--991524995bfb-osd--block--f8bd6329--9c62--4b75--8bb9--c34271b4a09b 253:15 0 7.3T 0 lvm sdm 8:192 0 7.3T 0 disk └─ceph--53e9c97b--714b--415c--90bd--7f5faf7d7389-osd--block--8917684c--129a--4754--a0d7--3e347e07de21 253:17 0 7.3T 0 lvm sdn 8:208 0 7.3T 0 disk └─ceph--9f14efa0--208a--4b6c--9063--04b73151eb02-osd--block--a9d289d8--ed05--4e41--8d52--598d3cbaa61d 253:19 0 7.3T 0 lvm sdo 8:224 0 7.3T 0 disk └─ceph--c396bc2b--bf57--4096--816a--58f383dda801-osd--block--cba8794d--8ab4--4765--9df1--9510927d02eb 253:21 0 7.3T 0 lvm sdp 8:240 0 7.3T 0 disk └─ceph--97b6b983--a0e7--4521--b2b4--c8ab8d251ef7-osd--block--616e2606--aa7b--4d36--9996--3ac188b1fb39 253:23 0 7.3T 0 lvm sdq 65:0 0 7.3T 0 disk └─ceph--e36dad3b--b6d2--4949--9e18--920f65807a6a-osd--block--58d102ba--bad8--47eb--8c3f--4ece53ab5d6a 253:25 0 7.3T 0 lvm sdr 65:16 0 7.3T 0 disk └─ceph--9a24f150--d275--456c--8d85--f5db51800557-osd--block--5e751e55--00ed--497c--b6e7--b802108bbb1e 253:27 0 7.3T 0 lvm sds 65:32 0 7.3T 0 disk └─ceph--2daaf0ec--1483--4ffa--b831--846bc2ed3fe5-osd--block--e2a23118--41f4--4285--8b1c--de25419ff56e 253:29 0 7.3T 0 lvm sdt 65:48 0 7.3T 0 disk └─ceph--660fb368--1dad--4275--8868--522f4e60354a-osd--block--e8731791--4560--4b22--aa4f--a650893b100d 253:31 0 7.3T 0 lvm Best Regards Michel

1 year, 3 months

3
3
0 0

Erasing Disk to the initial state

by Michel Niyoyita

Hello team, I have deployed ceph cluster in production , the cluster composed by two types of disks HDD and SSD , and the cluster was deployed using ceph-ansible , unfortunately after deployment the HDD disks appear only without SSD , would like to restart deployment from scratch , but I miss the way on how to erase disk to the initial state . try to format disks but LVM comeback with disks. sda 8:0 0 7.3T 0 disk └─ceph--da4a5d58--73ef--473b--9960--371f837cb5ed-osd--block--6e800937--c4d2--4fc9--84ca--083c39d057a8 253:1 0 7.3T 0 lvm sdb 8:16 0 7.3T 0 disk └─ceph--773f50a1--79ed--4908--8f81--74f85efeb473-osd--block--9737a046--ba8b--4494--91f7--b80dd894df0b 253:7 0 7.3T 0 lvm sdc 8:32 0 7.3T 0 disk └─ceph--02000cec--fdbc--4def--967e--a7c32c851964-osd--block--c54d8182--b5e7--4c73--8d7b--7d24c7a3ce15 253:6 0 7.3T 0 lvm Kindly help me to sort this out. Best regards Michel

1 year, 3 months

2
1
0 0

NoSuchBucket when bucket exists ..

by Shashi Dahal

Hi, In a working All-in-one test setup ( where making the bucket public works from the browser) radosgw-admin bucket list [ "711138fc95764303b83002c567ce0972/demo" ] I have another cluster where openstack and ceph are separate. I have set same config options in ceph.conf .. rgw_enable_apis = swift rgw_keystone_accepted_roles = member, _member_, admin, swiftoperator rgw_keystone_admin_domain = default rgw_keystone_admin_password = **** rgw_keystone_admin_project = service rgw_keystone_admin_user = **** rgw_keystone_api_version = 3 rgw_keystone_implicit_tenants = true rgw_keystone_url = https://<keystone-url>:5000 rgw_swift_account_in_url = true rgw_swift_versioning_enabled = true but the output is different radosgw-admin bucket list [ "demo", ] ## this is created without the project-uuid. What is happening is when I make the bucket public, it gives https://cloud.domain.com:8080/swift/v1/AUTH_b9a4b517525a483a9e111044713bfa1… -> NoSuchBucket Please let me know what setting could I be missing so that when the bucket is created, it is created with the project_id as well and the link works when the bucket is public. Thanks

1 year, 3 months

1
1
0 0

Serious cluster issue - Incomplete PGs

by Deep Dish

Hello. I really screwed up my ceph cluster. Hoping to get data off it so I can rebuild it. In summary, too many changes too quickly caused the cluster to develop incomplete pgs. Some PGS were reporting that OSDs were to be probes. I've created those OSD IDs (empty), however this wouldn't clear incompletes. Incompletes are part of EC pools. Running 17.2.5. This is the overall state: cluster: id: 49057622-69fc-11ed-b46e-d5acdedaae33 health: HEALTH_WARN Failed to apply 1 service(s): osd.dashboard-admin-1669078094056 1 hosts fail cephadm check cephadm background work is paused Reduced data availability: 28 pgs inactive, 28 pgs incomplete Degraded data redundancy: 55 pgs undersized 2 slow ops, oldest one blocked for 4449 sec, daemons [osd.25,osd.50,osd.51] have slow ops. These are PGs that are incomplete that HAVE DATA (Objects > 0) [ via ceph pg ls incomplete ]: 2.35 23199 0 0 0 95980273664 0 0 2477 incomplete 10s 2104'46277 28260:686871 [44,4,37,3,40,32]p44 [44,4,37,3,40,32]p44 2023-01-03T03:54:47.821280+0000 2022-12-29T18:53:09.287203+0000 14 queued for deep scrub 2.53 22821 0 0 0 94401175552 0 0 2745 remapped+incomplete 10s 2104'45845 28260:565267 [60,48,52,65,67,7]p60 [60]p60 2023-01-03T10:18:13.388383+0000 2023-01-03T10:18:13.388383+0000 408 queued for scrub 2.9f 22858 0 0 0 94555983872 0 0 2736 remapped+incomplete 10s 2104'45636 28260:759872 [56,59,3,57,5,32]p56 [56]p56 2023-01-03T10:55:49.848693+0000 2023-01-03T10:55:49.848693+0000 376 queued for scrub 2.be 22870 0 0 0 94429110272 0 0 2661 remapped+incomplete 10s 2104'45561 28260:813759 [41,31,37,9,7,69]p41 [41]p41 2023-01-03T14:02:15.790077+0000 2023-01-03T14:02:15.790077+0000 360 queued for scrub 2.e4 22953 0 0 0 94912278528 0 0 2648 remapped+incomplete 20m 2104'46048 28259:732896 [37,46,33,4,48,49]p37 [37]p37 2023-01-02T18:38:46.268723+0000 2022-12-29T18:05:47.431468+0000 18 queued for deep scrub 17.78 20169 0 0 0 84517834400 0 0 2198 remapped+incomplete 10s 3735'53405 28260:1243673 [4,37,2,36,66,0]p4 [41]p41 2023-01-03T14:21:41.563424+0000 2023-01-03T14:21:41.563424+0000 348 queued for scrub 17.d8 20328 0 0 0 85196053130 0 0 1852 remapped+incomplete 10s 3735'54458 28260:1309564 [38,65,61,37,58,39]p38 [53]p53 2023-01-02T18:32:35.371071+0000 2022-12-28T19:08:29.492244+0000 21 queued for deep scrub At present I'm unable to reliably access my data due to incomplete pages above. I'll post whatever outputs requested (won't post now as it can be rather verbose). Is there hope?

1 year, 3 months

2
1
0 0

2024

2023

2022

2021

2020

2019

ceph-users January 2023