I had already increased osd_max_backfills and osd_recovery_max_active
to speed things up, and most of the PGs were remapped pretty quickly
(within a couple of minutes), but these last 3 PGs took almost two
hours to complete, which was unexpected.
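For reference, on a Nautilus cluster these two options can usually be
raised at runtime via the centralized config; the values below are
only examples, not a recommendation:

ceph01:~ # ceph config set osd osd_max_backfills 4
ceph01:~ # ceph config set osd osd_recovery_max_active 8
ceph01:~ # ceph tell 'osd.*' injectargs '--osd_max_backfills 4'   # one-off change on the running OSDs

The config-based change persists across OSD restarts, while injectargs
only affects the currently running daemons.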
Quoting Frank Schilder <frans(a)dtu.dk>:
> Your metadata PGs *are* backfilling. That's the "61 keys/s" figure
> in the recovery I/O line of the ceph status output. If this is too
> slow, increase osd_max_backfills and osd_recovery_max_active.
> Or just have some coffee ...
>
> Best regards,
>
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Eugen Block <eblock(a)nde.ag>
> Sent: 10 October 2019 14:54:37
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Nautilus: PGs stuck remapped+backfilling
>
> Hi all,
>
> I have a strange issue with backfilling and I'm not sure what the cause is.
> It's an (upgraded) Nautilus cluster with an SSD cache tier for
> OpenStack and the CephFS metadata residing on the same SSDs; there
> were three SSDs in total.
> Today I added two new SSDs (NVMe, osd.15 and osd.16) to be able to
> shut off one old server that has only one SSD-OSD left (osd.20).
> Setting the crush weight of osd.20 to 0 (and adjusting the weights of
> the remaining SSDs for an even distribution) leaves 3 PGs in
> active+remapped+backfilling state. I don't understand why these
> remaining PGs don't finish backfilling; the crush rule is quite simple
> (all ssd pools are replicated with size 3). The backfilling PGs are
> all from the cephfs-metadata pool. Since there are 4 SSDs for 3
> replicas, the backfill should still be able to finish, right?
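(For reference, the exact commands used for the reweighting aren't
shown here; draining an OSD and rebalancing the others is normally
done with 'ceph osd crush reweight', roughly along these lines, with
the weights chosen to match the desired distribution:

ceph01:~ # ceph osd crush reweight osd.20 0          # drain osd.20
ceph01:~ # ceph osd crush reweight osd.15 0.45409    # example weights for the remaining SSDs
ceph01:~ # ceph osd crush reweight osd.16 0.45409
)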
>
> Can anyone share their thoughts on why these 3 PGs can't be recovered?
> If more information about the cluster is required, please let me know.
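One generic way to see what a single stuck PG is actually doing
(taking 36.b from the dump below as an example) is to query it
directly; the JSON output includes the up/acting sets and the current
recovery state:

ceph01:~ # ceph pg 36.b query | less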
>
> Regards,
> Eugen
>
>
> ceph01:~ # ceph osd pool ls detail | grep meta
> pool 36 'cephfs-metadata' replicated size 3 min_size 2 crush_rule 1
> object_hash rjenkins pg_num 16 pgp_num 16 last_change 283362 flags
> hashpspool,nodelete,nodeep-scrub stripe_width 0 application cephfs
>
>
> ceph01:~ # ceph pg dump | grep remapp
> dumped all
> 36.b 28306 0 0 28910 0 8388608 101408323 219497 3078 3078 active+remapped+backfilling 2019-10-10 13:36:27.427527 284595'98565869 284595:254216941 [15,16,9] 15 [20,9,10] 20 284427'98489406 2019-10-10 00:16:02.682911 284089'98003598 2019-10-06 16:03:27.558267 0
> 36.d 28087 0 0 25327 0 26375382 106722204 231020 3041 3041 active+remapped+backfilling 2019-10-10 13:36:27.404739 284595'97933905 284595:252878816 [16,15,9] 16 [20,9,10] 20 284427'97887652 2019-10-10 04:13:29.371905 284259'97502135 2019-10-07 20:06:43.304593 0
> 36.4 28060 0 0 28406 0 8389242 104059103 225188 3061 3061 active+remapped+backfilling 2019-10-10 13:36:27.440390 284595'105299618 284595:312976619 [16,9,15] 16 [20,9,10] 20 284427'105218591 2019-10-10 00:18:07.924006 284089'104696098 2019-10-06 16:20:17.123149 0
>
>
> rule ssd_ruleset {
>         id 1
>         type replicated
>         min_size 1
>         max_size 10
>         step take default class ssd
>         step chooseleaf firstn 0 type host
>         step emit
> }
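If anyone wants to verify that this rule can still map 3 replicas with
osd.20 weighted to 0, a quick sanity check (just a sketch, using rule
id 1 as shown above) is to run the exported crush map through
crushtool; --show-bad-mappings prints nothing if every input can be
mapped to the requested 3 OSDs:

ceph01:~ # ceph osd getcrushmap -o /tmp/crushmap
ceph01:~ # crushtool -i /tmp/crushmap --test --rule 1 --num-rep 3 --show-bad-mappings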
>
> This is the relevant part of the osd tree:
>
> ceph01:~ # ceph osd tree
> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
> -1 34.21628 root default
> -31 11.25406 host ceph01
> 25 hdd 3.59999 osd.25 up 1.00000 1.00000
> 26 hdd 3.59999 osd.26 up 1.00000 1.00000
> 27 hdd 3.59999 osd.27 up 1.00000 1.00000
> 15 ssd 0.45409 osd.15 up 1.00000 1.00000
> -34 11.25406 host ceph02
> 0 hdd 3.59999 osd.0 up 1.00000 1.00000
> 28 hdd 3.59999 osd.28 up 1.00000 1.00000
> 29 hdd 3.59999 osd.29 up 1.00000 1.00000
> 16 ssd 0.45409 osd.16 up 1.00000 1.00000
> -37 10.79999 host ceph03
> 31 hdd 3.59999 osd.31 up 1.00000 1.00000
> 32 hdd 3.59999 osd.32 up 1.00000 1.00000
> 33 hdd 3.59999 osd.33 up 1.00000 1.00000
> -24 0.45409 host san01-ssd
> 10 ssd 0.45409 osd.10 up 1.00000 1.00000
> -23 0.45409 host san02-ssd
> 9 ssd 0.45409 osd.9 up 1.00000 1.00000
> -22 0 host san03-ssd
> 20 ssd 0 osd.20 up 1.00000 1.00000
>
>
> Don't be confused by the '-ssd' suffix; we're using crush location
> hooks.
> This is the current PG distribution on the SSDs:
>
> ceph01:~ # ceph osd df | grep -E "^15 |^16 |^ 9|^10 |^20 "
> 15 ssd 0.45409 1.00000 465 GiB  34 GiB  32 GiB 1.2 GiB 857 MiB 431 GiB 7.29 0.22 27 up
> 16 ssd 0.45409 1.00000 465 GiB  37 GiB  34 GiB 1.5 GiB 964 MiB 428 GiB 7.87 0.23 31 up
> 10 ssd 0.45409 1.00000 745 GiB  27 GiB  25 GiB 1.7 GiB 950 MiB 718 GiB 3.65 0.11 29 up
>  9 ssd 0.45409 1.00000 745 GiB  34 GiB  32 GiB 1.3 GiB 902 MiB 711 GiB 4.60 0.14 30 up
> 20 ssd 0       1.00000 894 GiB 8.2 GiB 4.3 GiB 1.5 GiB 2.4 GiB 886 GiB 0.91 0.03  3 up
>
>
> Current ceph status:
>
> ceph01:~ # ceph -s
> cluster:
> id: 655cb05a-435a-41ba-83d9-8549f7c36167
> health: HEALTH_OK
>
> services:
> mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 2d)
> mgr: ceph03(active, since 8d), standbys: ceph01, ceph02
> mds: cephfs:1 {0=mds01=up:active} 1 up:standby-replay 1 up:standby
> osd: 26 osds: 26 up (since 66m), 26 in (since 66m); 3 remapped pgs
>
> data:
> pools: 8 pools, 264 pgs
> objects: 4.96M objects, 5.0 TiB
> usage: 16 TiB used, 31 TiB / 47 TiB avail
> pgs: 115745/14865558 objects misplaced (0.779%)
> 261 active+clean
> 3 active+remapped+backfilling
>
> io:
> client: 903 KiB/s rd, 8.8 MiB/s wr, 85 op/s rd, 266 op/s wr
> recovery: 0 B/s, 61 keys/s, 12 objects/s
> cache: 4.2 MiB/s flush, 15 MiB/s evict, 0 op/s promote
>