I'm trying to deploy a Ceph cluster with the cephadm tool. I've already successfully done all the steps except adding OSDs. My testing equipment consists of three hosts. Each host has an SSD, which the OS is installed on. On that SSD I created a partition that can be used as a Ceph block.db. The hosts also have 2 additional HDDs (spinning drives) for OSD data. I couldn't find in the docs how to deploy such a configuration. Do you have any hints on how to do that?
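What I have in mind is an OSD service spec roughly like the one below (the db partition path is just an example from my setup, and I'm not even sure cephadm accepts a bare partition under db_devices, which is part of my question):

service_type: osd
service_id: hdd-osds-with-ssd-db
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  paths:
    - /dev/sda4

which I would then apply with:

# ceph orch apply osd -i osd_spec.yml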
Thanks for help!
Hi
One of my clusters running Nautilus 14.2.8 is very slow (13 seconds or so,
where my other clusters return almost instantaneously) when doing a
'rados --pool rc3-se.rgw.buckets.index ls' from one of the monitors.
I checked
- ceph status => OK
- routing to/from osds ok (I see a lot of established connections to osds
due to the command, nothing in syn_sent indicating incomplete handshake)
- ping times are OK
- no interface errors
- no packet drops
- no increasing send queues
- and as far as I can see nothing out of the ordinary in mon and osd logs
I have no clue how to debug the issue further. If someone has pointers, it would be
much appreciated.
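Next I was planning to try the following to see where the time goes (the debug flags below are generic config overrides as I understand them, so please correct me if that is the wrong way to pass them):
- 'ceph osd perf' to check per-OSD commit/apply latencies
- 'ceph health detail' to check for large omap object warnings on the index pool
- 'rados --pool rc3-se.rgw.buckets.index ls --debug-objecter=20 --debug-ms=1' to see which OSD/op the client spends its time waiting on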
Kind Regards
Marcel
Hello,
I've been trying to understand if there is any way to get usage information based on storage classes for buckets.
Since there is no information available from the "radosgw-admin bucket stats" command, nor any other endpoint,
I tried to browse the source code but couldn't find any reference where the storage class would be exposed in such a way.
It also seems that RadosGW today does not save any counters for the number of objects stored per storage class when it
collects usage stats, which means there is no such metadata saved for a bucket.
I was hoping it was at least saved but not exposed, because then it would have been an easier fix than adding support to count the number of objects per storage class based on operations, which would involve a lot of places and mean writing to the bucket metadata on each op :(
Is my assumption correct that there is no way to retrieve such information, meaning there is no way to measure such usage?
If the answer is yes, I assume the only way to get something that could be measured would be to instead have multiple placement
targets, since that is exposed in the bucket info. The downside, though, is that you lose a lot of functionality related to lifecycle
and to moving a single object to another storage class.
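For completeness, the closest things I did find exposed (which only get me to placement-target granularity, not storage classes; "mybucket" is just a placeholder):

radosgw-admin bucket stats --bucket=mybucket   (shows the bucket's "placement_rule")
radosgw-admin zonegroup get                    (shows the "storage_classes" defined under each placement target)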
Best regards
Tobias
Hello everybody,
Can somebody add support for Debian Buster to ceph-deploy:
https://tracker.ceph.com/issues/42870
Highly appreciated,
Regards,
Jelle de Jong
The Ceph cluster was upgraded from Nautilus to Octopus. On the ceph-osd nodes we have
high I/O wait.
After increasing one pool's pg_num from 64 to 128 in response to the warning
message (too many objects per pg), CPU load and RAM usage rose on the
ceph-osd nodes and finally the whole cluster crashed. Three OSDs, one on
each host, are stuck in the down state (osd.34, osd.35, osd.40).
Starting one of the down OSD services causes high RAM usage and CPU load and
crashes the ceph-osd node, until the OSD service fails.
The active mgr service on each mon host crashes after consuming almost
all available RAM on the physical host.
I need to recover the PGs and resolve the corruption. How can I recover the unknown and
down PGs? Is there any way to start up the failed OSDs?
The following steps have been done:
1- The OSD nodes' kernel was upgraded to 5.4.2 before the Ceph cluster upgrade.
Reverting to the previous kernel 4.2.1 was tested to reduce iowait, but it
had no effect.
2- Recovered 11 PGs from the failed OSDs by exporting them with the
ceph-objectstore-tool utility and importing them on other OSDs (the exact
commands used are listed after step 11 below). The result: 9 PGs are "down" and 2 PGs are "unknown".
2-1) 9 PGs export and import successfully, but their status is "down" because
peering is blocked by the 3 failed OSDs ("peering_blocked_by"). I cannot mark the
OSDs lost, to prevent the unknown PGs from getting lost. These PGs are only KB and MB in size.
"peering_blocked_by": [
{
"osd": 34,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us proceed"
},
{
"osd": 35,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us proceed"
},
{
"osd": 40,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us proceed"
}
]
2-2) 1 PG (2.39) exports and imports successfully, but after starting the OSD
service it was imported into, the ceph-osd node's RAM and CPU consumption increases
and causes the ceph-osd node to crash until the OSD service fails. The other OSDs
on that ceph-osd node become "down". The PG status is "unknown". I cannot use
"force-create-pg" because of the data loss it would cause. PG 2.39 is 19G in size.
# ceph pg map 2.39
osdmap e40347 pg 2.39 (2.39) -> up [32,37] acting [32,37]
# ceph pg 2.39 query
Error ENOENT: i don't have pgid 2.39
PG 2.39 info on the failed OSD:
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-34 --op info --pgid 2.39
{
"pgid": "2.39",
"last_update": "35344'6456084",
"last_complete": "35344'6456084",
"log_tail": "35344'6453182",
"last_user_version": 10595821,
"last_backfill": "MAX",
"purged_snaps": [],
"history": {
"epoch_created": 146,
"epoch_pool_created": 79,
"last_epoch_started": 25208,
"last_interval_started": 25207,
"last_epoch_clean": 25208,
"last_interval_clean": 25207,
"last_epoch_split": 370,
"last_epoch_marked_full": 0,
"same_up_since": 8347,
"same_interval_since": 25207,
"same_primary_since": 8321,
"last_scrub": "35328'6440139",
"last_scrub_stamp": "2020-08-19T12:00:59.377593+0430",
"last_deep_scrub": "35261'6031075",
"last_deep_scrub_stamp": "2020-08-17T01:59:26.606037+0430",
"last_clean_scrub_stamp": "2020-08-19T12:00:59.377593+0430",
"prior_readable_until_ub": 0
},
"stats": {
"version": "35344'6456082",
"reported_seq": "11733156",
"reported_epoch": "35344",
"state": "active+clean",
"last_fresh": "2020-08-19T14:16:18.587435+0430",
"last_change": "2020-08-19T12:00:59.377747+0430",
"last_active": "2020-08-19T14:16:18.587435+0430",
"last_peered": "2020-08-19T14:16:18.587435+0430",
"last_clean": "2020-08-19T14:16:18.587435+0430",
"last_became_active": "2020-08-06T00:23:51.016769+0430",
"last_became_peered": "2020-08-06T00:23:51.016769+0430",
"last_unstale": "2020-08-19T14:16:18.587435+0430",
"last_undegraded": "2020-08-19T14:16:18.587435+0430",
"last_fullsized": "2020-08-19T14:16:18.587435+0430",
"mapping_epoch": 8347,
"log_start": "35344'6453182",
"ondisk_log_start": "35344'6453182",
"created": 146,
"last_epoch_clean": 25208,
"parent": "0.0",
"parent_split_bits": 7,
"last_scrub": "35328'6440139",
"last_scrub_stamp": "2020-08-19T12:00:59.377593+0430",
"last_deep_scrub": "35261'6031075",
"last_deep_scrub_stamp": "2020-08-17T01:59:26.606037+0430",
"last_clean_scrub_stamp": "2020-08-19T12:00:59.377593+0430",
"log_size": 2900,
"ondisk_log_size": 2900,
"stats_invalid": false,
"dirty_stats_invalid": false,
"omap_stats_invalid": false,
"hitset_stats_invalid": false,
"hitset_bytes_stats_invalid": false,
"pin_stats_invalid": false,
"manifest_stats_invalid": false,
"snaptrimq_len": 0,
"stat_sum": {
"num_bytes": 19749578960,
"num_objects": 2442,
"num_object_clones": 20,
"num_object_copies": 7326,
"num_objects_missing_on_primary": 0,
"num_objects_missing": 0,
"num_objects_degraded": 0,
"num_objects_misplaced": 0,
"num_objects_unfound": 0,
"num_objects_dirty": 2442,
"num_whiteouts": 0,
"num_read": 16120686,
"num_read_kb": 82264126,
"num_write": 19731882,
"num_write_kb": 379030181,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 2861,
"num_bytes_recovered": 21673259070,
"num_keys_recovered": 32,
"num_objects_omap": 2,
"num_objects_hit_set_archive": 0,
"num_bytes_hit_set_archive": 0,
"num_flush": 0,
"num_flush_kb": 0,
"num_evict": 0,
"num_evict_kb": 0,
"num_promote": 0,
"num_flush_mode_high": 0,
"num_flush_mode_low": 0,
"num_evict_mode_some": 0,
"num_evict_mode_full": 0,
"num_objects_pinned": 0,
"num_legacy_snapsets": 0,
"num_large_omap_objects": 0,
"num_objects_manifest": 0,
"num_omap_bytes": 152,
"num_omap_keys": 16,
"num_objects_repaired": 0
},
"up": [
40,
35,
34
],
"acting": [
40,
35,
34
],
"avail_no_missing": [],
"object_location_counts": [],
"blocked_by": [],
"up_primary": 40,
"acting_primary": 40,
"purged_snaps": []
},
"empty": 0,
"dne": 0,
"incomplete": 0,
"last_epoch_started": 25208,
"hit_set_history": {
"current_last_update": "0'0",
"history": []
}
}
PG 2.39 info on the OSD it was imported into:
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-37 --op info --pgid 2.39
PG '2.39' not found
2-3) 1 PG (2.79) is lost! This PG is not found on any of the three failed OSDs
(osd.34, osd.35, osd.40)! Its status is "unknown". The PG 2.79 export fails with:
"PG '2.79' not found"
# ceph pg map 2.79
Error ENOENT: i don't have pgid 2.79
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-34 --op info
--pgid 2.79
PG '2.79' not found
3- Used https://gitlab.lbader.de/kryptur/ceph-recovery/tree/master but it
does not work for recent Ceph versions; it was only tested on the "hammer" release.
4- Used https://ceph.io/planet/recovering-from-a-complete-node-failure/
but in the LVM scenario I could not mount the failed OSD's LV to a new
/var/lib/ceph/osd/ceph-x. I could not prepare and activate a new OSD on the
failed OSD's disk.
5- Set min_size=1 on the pool the down PGs belong to and restarted the OSDs the
PGs were imported into, but no change.
6- Set min_size=1 on the pool PG 2.39 belongs to and restarted the OSDs the PG
was imported into, but no change.
7- Repaired the failed OSDs using ceph-objectstore-tool, marked them "in" and
started them, but no change.
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-x --op repair
8- Repaired the 2 unknown PGs, but no change.
# ceph pg repair 2.39
# ceph pg repair 2.79
9- Forced recovery of the 2 unknown PGs, but no change.
# ceph pg force-recovery 2.39
# ceph pg force-recovery 2.79
10- Checked the PID limit on the ceph-osd nodes, because the OSD services failed to
start:
kernel.pid_max = 4194304
11- Raised osd_op_thread_suicide_timeout to 900, but no change.
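For reference, the export/import mentioned in step 2 was done roughly as follows, with the relevant OSD services stopped (OSD ids and file names are just examples):
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-34 --pgid 2.39 --op export --file /root/pg2.39.export
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-37 --pgid 2.39 --op import --file /root/pg2.39.export
The only remaining options I can think of, and on which I would really like feedback before trying anything because both can lose data as far as I understand, are marking the failed OSDs lost so peering can proceed, or marking an imported PG copy complete offline:
# ceph osd lost 34 --yes-i-really-mean-it
# ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-37 --pgid 2.39 --op mark-complete
Is either of these reasonable in this situation?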
I have a production cluster with 60 OSDs and no extra journals; it performs okay. Now I added an extra SSD pool with 16 Micron 5100 MAX drives, and its performance is slightly slower than or equal to the 60-HDD pool, for 4K random as well as sequential reads. Everything is on a dedicated 2x10G network. The HDDs are still on Filestore, the SSDs on Bluestore, Ceph Luminous.
What should be possible with 16 SSDs vs. 60 HDDs with no extra journals?
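For what it's worth, the kind of test I ran was along these lines (pool name is just an example):
# rados bench -p ssdpool 60 write -b 4096 -t 16 --no-cleanup
# rados bench -p ssdpool 60 seq -t 16
# rados bench -p ssdpool 60 rand -t 16
# rados -p ssdpool cleanup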
Hi all,
I'm trying to migrate my 5-node cluster to Nautilus… Debian Stretch (I
intend to update this once I'm onto Nautilus) is the OS, and all 5 machines
are combined OSD/MON/MDS nodes.
Nodes were originally deployed with an earlier version of Ceph (can't
recall now) using filestore on btrfs. Later, after an abortive attempt
to move to Bluestore (it ran… veeeeerrryyy ssslllooowwwlllyyy), I moved
back to filestore on xfs, which whilst no speed demon, worked well enough.
Journal is on the OSD due to space constraints elsewhere in the nodes.
So step 1:
> If you are unsure whether or not your Luminous cluster has completed a full scrub of all PGs, you can check your cluster’s state by running:
>
> # ceph osd dump | grep ^flags
>
> In order to be able to proceed to Nautilus, your OSD map must include the recovery_deletes and purged_snapdirs flags.
Tick to that, I see both flags.
Step 2.
> Make sure your cluster is stable and healthy (no down or recovering OSDs). (Optional, but recommended.)
Tick, all healthy.
Step 3.
> Set the noout flag for the duration of the upgrade. (Optional, but recommended.):
Done
Step 4.
> Upgrade monitors by installing the new packages and restarting the monitor daemons.
Okay, so I update /etc/apt/sources.list.d/ceph.list to point to the
`nautilus` repository, `apt-get update`, `apt-get dist-upgrade -y`.
This goes without a hitch, I now have Ceph 12 binaries on my nodes.
> systemctl restart ceph-mon.target
Ran that, on all of them (and yes, they are all in the quorum)…
> # ceph mon dump | grep min_mon_release
>
> should report:
>
> min_mon_release 14 (nautilus)
I get this:
> root@helium:~# ceph mon dump | grep min_mon_release
> dumped monmap epoch 4
No `min_mon_release` mentioned anywhere. I tried re-starting the MON
daemons on all 5 nodes, even doing it in parallel using Ansible… no dice.
Nothing says what to do at this point. It's not reporting an earlier
release, it's just not reporting full stop.
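Before pressing on, the next sanity check I plan to run (assuming these subcommands report per-daemon versions the way I think they do) is:
> root@helium:~# ceph versions
> root@helium:~# ceph tell mon.helium version
just to confirm the MON daemons really are running the new code rather than an old binary that never got restarted.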
Figuring that, well, I *did* update all the monitors, and re-start them,
I press on.
Step 5.
> Upgrade ceph-mgr daemons by installing the new packages and restarting all manager daemons.
Done, no issues.
> Verify the ceph-mgr daemons are running by checking ceph -s:
I get:
> root@helium:~# ceph -s
> cluster:
> id: 45b532b7-aa3d-4754-9906-d4a70b57630c
> health: HEALTH_WARN
> noout flag(s) set
>
> services:
> mon: 5 daemons, quorum hydrogen,helium,carbon,nitrogen,boron
> mgr: hydrogen(active), standbys: carbon, helium
> mds: cephfs-1/1/1 up {0=helium=up:active}, 2 up:standby
> osd: 5 osds: 5 up, 5 in
> flags noout
>
> data:
> pools: 4 pools, 324 pgs
> objects: 302.75k objects, 1.13TiB
> usage: 3.52TiB used, 5.53TiB / 9.05TiB avail
> pgs: 323 active+clean
> 1 active+clean+scrubbing+deep
>
> io:
> client: 45.0KiB/s rd, 1.10MiB/s wr, 1op/s rd, 38op/s wr
So far so good.
Step 6.
> Upgrade all OSDs by installing the new packages and restarting the ceph-osd daemons on all OSD hosts:
> systemctl restart ceph-osd.target
I actually missed this step at first and went on to step 7, but doubled
back and did it.
Step 7.
> If there are any OSDs in the cluster deployed with ceph-disk (e.g., almost any OSDs that were created before the Mimic release), you need to tell ceph-volume to adopt responsibility for starting the daemons.
Okay, seems simple enough. Now I did this by mistake earlier, so I'll
force it to ensure everything is up to scratch.
> root@hydrogen:~# ceph-volume simple scan --force
> stderr: lsblk: /var/lib/ceph/osd/ceph-5: not a block device
> stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
> Running command: /sbin/cryptsetup status /dev/sdc1
> --> OSD 5 got scanned and metadata persisted to file: /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> --> To take over managment of this scanned OSD, and disable ceph-disk and udev, run:
> --> ceph-volume simple activate 5 2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe
> root@hydrogen:~# ceph-volume simple activate --all
> --> activating OSD specified in /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> --> Required devices (data, and journal) not present for filestore
> --> filestore devices found: [u'data']
> --> RuntimeError: Unable to activate filestore OSD due to missing devices
Uhh ohh? So, my journal is on the OSD itself:
> root@hydrogen:~# ls -l /var/lib/ceph/osd/ceph-5/journal
> -rw-r--r-- 1 ceph ceph 21474836480 Aug 29 14:16 /var/lib/ceph/osd/ceph-5/journal
> root@hydrogen:~# cat /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> {
> "active": "ok",
> "ceph_fsid": "45b532b7-aa3d-4754-9906-d4a70b57630c",
> "cluster_name": "ceph",
> "data": {
> "path": "/dev/sdc1",
> "uuid": "2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe"
> },
> "fsid": "2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe",
> "keyring": "…censored…",
> "magic": "ceph osd volume v026",
> "ready": "ready",
> "require_osd_release": "",
> "systemd": "",
> "type": "filestore",
> "whoami": 5
> }
How do I tell it that the journal is on `/dev/sdc1` as a file within the
XFS file store?
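For what it's worth, the workaround I'm tempted to try, which is purely my guess at what `simple activate` expects (so please tell me if this is wrong or dangerous), is to hand-edit the JSON and add a journal entry pointing at that file:
> "journal": {
>     "path": "/var/lib/ceph/osd/ceph-5/journal"
> },
and then re-run `ceph-volume simple activate 5 2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe`.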
--
Stuart Longland (aka Redhatter, VK4MSL)
I haven't lost my mind...
...it's backed up on a tape somewhere.
Hi,
I've had a complete monitor failure, which I have recovered from with the steps here: https://docs.ceph.com/docs/mimic/rados/troubleshooting/troubleshooting-mon/…
The data and metadata pools are there and are completely intact, but ceph is reporting that there are no filesystems, where (before the failure) there was one.
Is there any way of putting the filesystem back together again without having to resort to rebuilding a new metadata pool with cephfs-data-scan?
I'm on ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)
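The closest thing I have found, which I have not run yet because I am not sure it is safe (this is just my reading of the docs, so please confirm or correct), is to stop all MDS daemons and recreate the filesystem entry on top of the existing pools, where the pool names below are placeholders for my existing metadata and data pools:
# ceph fs new cephfs <metadata-pool> <data-pool> --force --allow-dangerous-metadata-overlay
# ceph fs reset cephfs --yes-i-really-mean-it
Is that a sane approach, or will 'fs new' clobber the existing metadata?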
Thanks,
Harlan
Hi all,
We have a Ceph cluster in production with 6 OSD servers (each with 16x8TB
disks), 3 mons/mgrs and 3 MDSs. Both the public and cluster networks are
10Gb and work well.
After a major crash in April, we set the option bluefs_buffered_io to
false to work around the large-write bug that occurred when bluefs_buffered_io
was true (we were on version 14.2.8 and the default value at that time was
true).
Since then, we regularly have some OSDs wrongly marked down by the
cluster after a heartbeat timeout (heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15).
Generally the OSD restarts and the cluster is healthy again, but several
times, after many of these kicks, the OSD has reached the
osd_op_thread_suicide_timeout and gone down for good.
We increased osd_op_thread_timeout and
osd_op_thread_suicide_timeout... The problem still occurs (but less
frequently).
A few days ago, we upgraded to 14.2.11 and reverted the timeouts to their
default values, hoping that this would solve the problem (we thought it
might be related to this bug: https://tracker.ceph.com/issues/45943),
but it didn't. We still have some OSDs wrongly marked down.
Can somebody help us fix this problem?
Thanks.
Here is an extract of an osd log at failure time:
---------------------------------
2020-08-28 02:19:05.019 7f03f1384700 0 log_channel(cluster) log [DBG] :
44.7d scrub starts
2020-08-28 02:19:25.755 7f040e43d700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:19:25.755 7f040dc3c700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
this last line is repeated more than 1000 times
...
2020-08-28 02:20:17.484 7f040d43b700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:20:17.551 7f03f1384700 0
bluestore(/var/lib/ceph/osd/ceph-16) log_latency_fn slow operation
observed for _collection_list, latency = 67.3532s, lat = 67s cid
=44.7d_head start GHMAX end GHMAX max 25
...
2020-08-28 02:20:22.600 7f040dc3c700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:21:20.774 7f03f1384700 0
bluestore(/var/lib/ceph/osd/ceph-16) log_latency_fn slow operation
observed for _collection_list, latency = 63.223s, lat = 63s cid
=44.7d_head start
#44:beffc78d:::rbd_data.1e48e8ab988992.00000000000011bd:0# end #MAX# max
2147483647
2020-08-28 02:21:20.774 7f03f1384700 1 heartbeat_map reset_timeout
'OSD::osd_op_tp thread 0x7f03f1384700' had timed out after 15
2020-08-28 02:21:20.805 7f03f1384700 0 log_channel(cluster) log [DBG] :
44.7d scrub ok
2020-08-28 02:21:21.099 7f03fd997700 0 log_channel(cluster) log [WRN] :
Monitor daemon marked osd.16 down, but it is still running
2020-08-28 02:21:21.099 7f03fd997700 0 log_channel(cluster) log [DBG] :
map e609411 wrongly marked me down at e609410
2020-08-28 02:21:21.099 7f03fd997700 1 osd.16 609411
start_waiting_for_healthy
2020-08-28 02:21:21.119 7f03fd997700 1 osd.16 609411 start_boot
2020-08-28 02:21:21.124 7f03f0b83700 1 osd.16 pg_epoch: 609410
pg[36.3d0( v 609409'481293 (449368'478292,609409'481293]
local-lis/les=609403/609404 n=154651 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609410/609410/608752) [25,72] r=-1
lpr=609410 pi=[609403,609410)/1 luod=0'0 lua=609392'481198
crt=609409'481293 lcod 609409'481292 active mbc={}]
start_peering_interval up [25,72,16] -> [25,72], acting [25,72,16] ->
[25,72], acting_primary 25 -> 25, up_primary 25 -> 25, role 2 -> -1,
features acting 4611087854031667199 upacting 4611087854031667199
...
2020-08-28 02:21:21.166 7f03f0b83700 1 osd.16 pg_epoch: 609411
pg[36.56( v 609409'480511 (449368'477424,609409'480511]
local-lis/les=609403/609404 n=153854 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609410/609410/609410) [103,102]
r=-1 lpr=609410 pi=[609403,609410)/1 crt=609409'480511 lcod
609409'480510 unknown NOTIFY mbc={}] state<Start>: transitioning to Stray
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
public network em1 numa node 0
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
cluster network em2 numa node 0
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
objectstore and network numa nodes do not match
2020-08-28 02:21:21.307 7f04073b0700 1 osd.16 609413 set_numa_affinity
not setting numa affinity
2020-08-28 02:21:21.566 7f040a435700 1 osd.16 609413 tick checking mon
for new map
2020-08-28 02:21:22.515 7f03fd997700 1 osd.16 609414 state: booting ->
active
2020-08-28 02:21:22.515 7f03f0382700 1 osd.16 pg_epoch: 609414
pg[36.20( v 609409'483167 (449368'480117,609409'483167]
local-lis/les=609403/609404 n=155171 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609361) [97,16,72]
r=1 lpr=609414 pi=[609403,609414)/1 crt=609409'483167 lcod 609409'483166
unknown NOTIFY mbc={}] start_peering_interval up [97,72] -> [97,16,72],
acting [97,72] -> [97,16,72], acting_primary 97 -> 97, up_primary 97 ->
97, role -1 -> 1, features acting 4611087854031667199 upacting
4611087854031667199
...
2020-08-28 02:21:22.522 7f03f1384700 1 osd.16 pg_epoch: 609414
pg[36.2f3( v 609409'479796 (449368'476712,609409'479796]
local-lis/les=609403/609404 n=154451 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609414) [16,34,21]
r=0 lpr=609414 pi=[609403,609414)/1 crt=609409'479796 lcod 609409'479795
mlcod 0'0 unknown NOTIFY mbc={}] start_peering_interval up [34,21] ->
[16,34,21], acting [34,21] -> [16,34,21], acting_primary 34 -> 16,
up_primary 34 -> 16, role -1 -> 0, features acting 4611087854031667199
upacting 4611087854031667199
2020-08-28 02:21:22.522 7f03f1384700 1 osd.16 pg_epoch: 609414
pg[36.2f3( v 609409'479796 (449368'476712,609409'479796]
local-lis/les=609403/609404 n=154451 ec=435353/435353 lis/c
609403/609403 les/c/f 609404/609404/0 609414/609414/609414) [16,34,21]
r=0 lpr=609414 pi=[609403,609414)/1 crt=609409'479796 lcod 609409'479795
mlcod 0'0 unknown mbc={}] state<Start>: transitioning to Primary
2020-08-28 02:21:24.738 7f03f1384700 0 log_channel(cluster) log [DBG] :
36.2f3 scrub starts
2020-08-28 02:22:18.857 7f03f1384700 0 log_channel(cluster) log [DBG] :
36.2f3 scrub ok
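Given the slow _collection_list entries above, the next thing we plan to try (our own idea, please tell us if it is misguided) is an online compaction of the RocksDB on the affected OSDs, plus a double check that our option change really took effect, via the admin socket on the OSD host:
# ceph daemon osd.16 compact
# ceph daemon osd.16 config get bluefs_buffered_io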