The output from ceph osd pool ls detail tells me nothing, except that
pgp_num is not where it should be. Can you help me read the output? How
do I estimate how long the split will take?
[root@s3db1 ~]# ceph status
  cluster:
    id:     dca79fff-ffd0-58f4-1cff-82a2feea05f4
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            10 backfillfull osd(s)
            19 nearfull osd(s)
            37 pool(s) backfillfull
            BlueFS spillover detected on 1 OSD(s)
            13 large omap objects
            Low space hindering backfill (add storage if this doesn't resolve itself): 234 pgs backfill_toofull
  ...

  data:
    pools:   37 pools, 4032 pgs
    objects: 121.40M objects, 199 TiB
    usage:   627 TiB used, 169 TiB / 795 TiB avail
    pgs:     45263471/364213596 objects misplaced (12.428%)
             3719 active+clean
             209  active+remapped+backfill_wait+backfill_toofull
             59   active+remapped+backfill_wait
             24   active+remapped+backfill_toofull
             20   active+remapped+backfilling
             1    active+remapped+forced_backfill+backfill_toofull

  io:
    client:   8.4 MiB/s rd, 127 MiB/s wr, 208 op/s rd, 163 op/s wr
    recovery: 276 MiB/s, 164 objects/s
[root@s3db1 ~]# ceph osd pool ls detail
...
pool 10 'eu-central-1.rgw.buckets.index' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 320966 lfor 0/193276/306366 flags hashpspool,backfillfull stripe_width 0 application rgw
pool 11 'eu-central-1.rgw.buckets.data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 2048 pgp_num 1946 pgp_num_target 2048 autoscale_mode warn last_change 320966 lfor 0/263549/317774 flags hashpspool,backfillfull stripe_width 0 application rgw
...
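My naive estimate, assuming the recovery rate stays around the current 164
objects/s and that the misplaced objects are mostly from this split:

echo "45263471 / 164 / 86400" | bc -l    # misplaced objects / objects per second / seconds per day, roughly 3.2 days

If I understand it right, pgp_num will also keep stepping from 1946 towards the
pgp_num_target of 2048 while backfill completes, creating new misplaced objects
each time, so that is probably a lower bound.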
On Tue, Mar 30, 2021 at 3:07 PM Dan van der Ster <dan(a)vanderster.com> wrote:
It would be safe to turn off the balancer, yes go ahead.
To know if adding more hardware will help, we need to see how much
longer this current splitting should take. This will help:
ceph status
ceph osd pool ls detail
-- dan
On Tue, Mar 30, 2021 at 3:00 PM Boris Behrens <bb(a)kervyn.de> wrote:
I would think it is due to splitting, because the balancer refuses to do its
work due to too many misplaced objects.
I am also thinking of turning it off for now, so it doesn't begin its work
once we are back down to 5% misplaced objects.
Would adding more hardware help? We wanted to insert another OSD node with
7x8TB disks anyway, but postponed it due to the rebalancing.
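(Presumably just "ceph balancer off" for that, and "ceph balancer status" to confirm it is off.)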
On Tue, Mar 30, 2021 at 2:23 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>
> Are those PGs backfilling due to splitting or due to balancing?
> If it's the former, I don't think there's a way to pause them with
> upmap or any other trick.
>
> -- dan
>
> On Tue, Mar 30, 2021 at 2:07 PM Boris Behrens <bb(a)kervyn.de> wrote:
> >
> > One week later the ceph cluster is still balancing.
> > What worries me like hell is the %USE on a lot of those OSDs. Does ceph
> > resolve this on its own? We are currently down to 5TB of space in the cluster.
> > Rebalancing single OSDs doesn't work well and it increases the "misplaced
> > objects".
> >
> > I thought about letting upmap do some rebalancing. Anyone know if this is a
> > good idea? Or should I just bite my nails and wait, as this is giving me the
> > headache of my life.
> > [root@s3db1 ~]# ceph osd getmap -o om; osdmaptool om --upmap out.txt
> > --upmap-pool eu-central-1.rgw.buckets.data --upmap-max 10; cat out.txt
> > got osdmap epoch 321975
> > osdmaptool: osdmap file 'om'
> > writing upmap command output to: out.txt
> > checking for upmap cleanups
> > upmap, max-count 10, max deviation 5
> > limiting to pools eu-central-1.rgw.buckets.data ([11])
> > pools eu-central-1.rgw.buckets.data
> > prepared 10/10 changes
> > ceph osd rm-pg-upmap-items 11.209
> > ceph osd rm-pg-upmap-items 11.253
> > ceph osd pg-upmap-items 11.7f 79 88
> > ceph osd pg-upmap-items 11.fc 53 31 105 78
> > ceph osd pg-upmap-items 11.1d8 84 50
> > ceph osd pg-upmap-items 11.47f 94 86
> > ceph osd pg-upmap-items 11.49c 44 71
> > ceph osd pg-upmap-items 11.553 74 50
> > ceph osd pg-upmap-items 11.6c3 66 63
> > ceph osd pg-upmap-items 11.7ad 43 50
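> > (If that plan looks sane, I assume it can be applied by simply sourcing the
> > file, since osdmaptool writes only the pg-upmap-items commands into out.txt:
> > source out.txt)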
> >
> > ID  CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
> > -1        795.42548        - 795 TiB 626 TiB 587 TiB 82 GiB 1.4 TiB 170 TiB 78.64 1.00   - root default
> > 56  hdd 7.32619 1.00000 7.3 TiB 6.4 TiB 6.4 TiB 684 MiB 16 GiB 910 GiB 87.87 1.12 129 up osd.56
> > 67  hdd 7.27739 1.00000 7.3 TiB 6.4 TiB 6.4 TiB 582 MiB 16 GiB 865 GiB 88.40 1.12 115 up osd.67
> > 79  hdd 3.63689 1.00000 3.6 TiB 3.2 TiB 432 GiB 1.9 GiB 0 B 432 GiB 88.40 1.12 63 up osd.79
> > 53  hdd 7.32619 1.00000 7.3 TiB 6.5 TiB 6.4 TiB 971 MiB 22 GiB 864 GiB 88.48 1.13 114 up osd.53
> > 51  hdd 7.27739 1.00000 7.3 TiB 6.5 TiB 6.4 TiB 734 MiB 15 GiB 837 GiB 88.77 1.13 120 up osd.51
> > 73  hdd 14.55269 1.00000 15 TiB 13 TiB 13 TiB 1.8 GiB 39 GiB 1.6 TiB 88.97 1.13 246 up osd.73
> > 55  hdd 7.32619 1.00000 7.3 TiB 6.5 TiB 6.5 TiB 259 MiB 15 GiB 825 GiB 89.01 1.13 118 up osd.55
> > 70  hdd 7.27739 1.00000 7.3 TiB 6.5 TiB 6.5 TiB 291 MiB 16 GiB 787 GiB 89.44 1.14 119 up osd.70
> > 42  hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 685 MiB 8.2 GiB 374 GiB 90.23 1.15 60 up osd.42
> > 94  hdd 3.63869 1.00000 3.6 TiB 3.3 TiB 3.3 TiB 132 MiB 7.7 GiB 345 GiB 90.75 1.15 64 up osd.94
> > 25  hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 3.2 MiB 8.1 GiB 352 GiB 90.79 1.15 53 up osd.25
> > 31  hdd 7.32619 1.00000 7.3 TiB 6.7 TiB 6.6 TiB 223 MiB 15 GiB 690 GiB 90.80 1.15 117 up osd.31
> > 84  hdd 7.52150 1.00000 7.5 TiB 6.8 TiB 6.6 TiB 159 MiB 16 GiB 699 GiB 90.93 1.16 121 up osd.84
> > 82  hdd 3.63689 1.00000 3.6 TiB 3.3 TiB 332 GiB 1.0 GiB 0 B 332 GiB 91.08 1.16 59 up osd.82
> > 89  hdd 7.52150 1.00000 7.5 TiB 6.9 TiB 6.6 TiB 400 MiB 15 GiB 670 GiB 91.29 1.16 126 up osd.89
> > 33  hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 382 MiB 8.6 GiB 327 GiB 91.46 1.16 66 up osd.33
> > 90  hdd 7.52150 1.00000 7.5 TiB 6.9 TiB 6.6 TiB 338 MiB 15 GiB 658 GiB 91.46 1.16 112 up osd.90
> > 105 hdd 3.63869 0.89999 3.6 TiB 3.3 TiB 3.3 TiB 206 MiB 8.1 GiB 301 GiB 91.91 1.17 56 up osd.105
> > 66  hdd 7.27739 0.95000 7.3 TiB 6.7 TiB 6.7 TiB 322 MiB 16 GiB 548 GiB 92.64 1.18 121 up osd.66
> > 46  hdd 7.27739 1.00000 7.3 TiB 6.8 TiB 6.7 TiB 316 MiB 16 GiB 536 GiB 92.81 1.18 119 up osd.46
> >
> > On Tue, Mar 23, 2021 at 7:59 PM Boris Behrens <bb(a)kervyn.de> wrote:
> >
> > > Good point. Thanks for the hint. I changed it for all OSDs from 5 to 1
> > > *crossing finger*
> > >
> > > On Tue, Mar 23, 2021 at 7:45 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >
> > >> I see. When splitting PGs, the OSDs will increase their used space
> > >> temporarily to make room for the new PGs.
> > >> When going from 1024->2048 PGs, that means that half of the objects from
> > >> each PG will be copied to a new PG, and then the previous PGs will have
> > >> those objects deleted.
> > >>
> > >> Make sure osd_max_backfills is set to 1, so that not too many PGs are
> > >> moving concurrently.
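> > >> (On Nautilus that would presumably be "ceph config set osd osd_max_backfills 1",
> > >> or "ceph tell 'osd.*' injectargs '--osd_max_backfills 1'" to change it on the
> > >> running daemons right away.)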
> > >>
> > >>
> > >>
> > >> On Tue, Mar 23, 2021, 7:39 PM Boris Behrens <bb(a)kervyn.de>
wrote:
> > >>
> > >>> Thank you.
> > >>> Currently I do not have any full OSDs (all <90%) but I keep this in mind.
> > >>> What worries me is the ever increasing %USE metric (it went up from
> > >>> around 72% to 75% in three hours). It looks like a lot of data is coming
> > >>> in (barely any new data actually arrives at the moment), but I think this
> > >>> might have to do with my "let's try to increase the PGs to 2048". I hope
> > >>> that ceph begins to split the old PGs into new ones and removes the old PGs.
> > >>>
> > >>> ID CLASS WEIGHT    REWEIGHT SIZE    RAW USE DATA    OMAP   META    AVAIL   %USE  VAR  PGS STATUS TYPE NAME
> > >>> -1       795.42548        - 795 TiB 597 TiB 556 TiB 88 GiB 1.4 TiB 198 TiB 75.12 1.00   - root default
> > >>>
> > >>> On Tue, Mar 23, 2021 at 7:21 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>
> > >>>> While you're watching things, if an OSD is getting too close for
> > >>>> comfort to the full ratio, you can temporarily increase it, e.g.
> > >>>> ceph osd set-full-ratio 0.96
> > >>>>
> > >>>> But don't set that too high -- you can really break an OSD if it gets
> > >>>> 100% full (and then can't delete objects or whatever...)
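> > >>>> (The ratios currently in effect should show up in "ceph osd dump | grep ratio";
> > >>>> the defaults are nearfull_ratio 0.85, backfillfull_ratio 0.90, full_ratio 0.95.)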
> > >>>>
> > >>>> -- dan
> > >>>>
> > >>>> On Tue, Mar 23, 2021 at 7:17 PM Boris Behrens <bb(a)kervyn.de> wrote:
> > >>>> >
> > >>>> > Ok, then I will try to reweight the most filled OSDs to .95 and see
> > >>>> > if this helps.
> > >>>> >
> > >>>> > On Tue, Mar 23, 2021 at 7:13 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >>
> > >>>> >> Data goes to *all* PGs uniformly.
> > >>>> >> Max_avail is limited by the available space on the most full OSD --
> > >>>> >> you should pay close attention to those and make sure they are moving
> > >>>> >> in the right direction (decreasing!)
> > >>>> >>
> > >>>> >> Another point -- IMHO you should aim to get all PGs active+clean
> > >>>> >> before you add yet another batch of new disks. While there are PGs
> > >>>> >> backfilling, your osdmaps are accumulating on the mons and osds --
> > >>>> >> this itself will start to use a lot of space, and active+clean is the
> > >>>> >> only way to trim the old maps.
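> > >>>> >> (A rough way to see how many osdmap epochs are still being held, assuming
> > >>>> >> the field names are unchanged: "ceph report | grep -E 'osdmap_(first|last)_committed'";
> > >>>> >> the difference between the two numbers is the count of retained maps.)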
> > >>>> >>
> > >>>> >> -- dan
> > >>>> >>
> > >>>> >> On Tue, Mar 23, 2021 at 7:05 PM Boris Behrens <bb(a)kervyn.de> wrote:
> > >>>> >> >
> > >>>> >> > So, do nothing and wait for ceph to recover?
> > >>>> >> >
> > >>>> >> > In theory there should be enough disk space (more disks arriving
> > >>>> >> > tomorrow), but I fear that there might be an issue when the backups get
> > >>>> >> > exported overnight to this s3. Currently the max_avail lingers around 13TB
> > >>>> >> > and I hope that the data will go to other PGs than the ones that are
> > >>>> >> > currently on filled OSDs.
> > >>>> >> >
> > >>>> >> >
> > >>>> >> >
> > >>>> >> > On Tue, Mar 23, 2021 at 6:58 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >>
> > >>>> >> >> Hi,
> > >>>> >> >>
> > >>>> >> >> backfill_toofull is not a bad thing when the cluster is really full
> > >>>> >> >> like yours. You should expect some of the most full OSDs to eventually
> > >>>> >> >> start decreasing in usage, as the PGs are moved to the new OSDs. Those
> > >>>> >> >> backfill_toofull states should then resolve themselves as the OSD
> > >>>> >> >> usage flattens out.
> > >>>> >> >> Keep an eye on the usage of the backfill_full and nearfull OSDs though
> > >>>> >> >> -- if they do eventually go above the full_ratio (95% by default),
> > >>>> >> >> then writes to those OSDs would stop.
> > >>>> >> >>
> > >>>> >> >> But if on the other hand you're suffering from lots of slow ops or
> > >>>> >> >> anything else visible to your users, then you could try to take some
> > >>>> >> >> actions to slow down the rebalancing. Just let us know if that's the
> > >>>> >> >> case and we can see about changing osd_max_backfills, some weights or
> > >>>> >> >> maybe using the upmap-remapped tool.
> > >>>> >> >>
> > >>>> >> >> -- Dan
> > >>>> >> >>
> > >>>> >> >> On Tue, Mar 23, 2021 at 6:07 PM Boris Behrens
<
bb(a)kervyn.de>
> > >>>> wrote:
> > >>>> >> >> >
> > >>>> >> >> > Ok, I should have listened to you :)
> > >>>> >> >> >
> > >>>> >> >> > In the last week we added more storage but the issue got worse instead.
> > >>>> >> >> > Today I realized that the PGs were up to 90GB (the bytes column in
> > >>>> >> >> > ceph pg ls said 95705749636), and the balancer kept mentioning the 2048
> > >>>> >> >> > PGs for this pool. We were at 72% utilization (ceph osd df tree, first
> > >>>> >> >> > line) for our cluster and I increased the PGs to 2048.
> > >>>> >> >> >
> > >>>> >> >> > Now I am in a world of trouble.
> > >>>> >> >> > The space in the cluster went down, I am at 45% misplaced objects, and
> > >>>> >> >> > we already added 20x4TB disks just to not go down completely.
> > >>>> >> >> >
> > >>>> >> >> > The utilization is still going up and the overall free space in the
> > >>>> >> >> > cluster seems to go down. This is what my ceph status looks like and
> > >>>> >> >> > now I really need help to get that thing back to normal:
> > >>>> >> >> > [root@s3db1 ~]# ceph status
> > >>>> >> >> >   cluster:
> > >>>> >> >> >     id:     dca79fff-ffd0-58f4-1cff-82a2feea05f4
> > >>>> >> >> >     health: HEALTH_WARN
> > >>>> >> >> >             4 backfillfull osd(s)
> > >>>> >> >> >             17 nearfull osd(s)
> > >>>> >> >> >             37 pool(s) backfillfull
> > >>>> >> >> >             13 large omap objects
> > >>>> >> >> >             Low space hindering backfill (add storage if this doesn't resolve itself): 570 pgs backfill_toofull
> > >>>> >> >> >
> > >>>> >> >> >   services:
> > >>>> >> >> >     mon: 3 daemons, quorum ceph-s3-mon1,ceph-s3-mon2,ceph-s3-mon3 (age 44m)
> > >>>> >> >> >     mgr: ceph-mgr2(active, since 15m), standbys: ceph-mgr3, ceph-mgr1
> > >>>> >> >> >     mds: 3 up:standby
> > >>>> >> >> >     osd: 110 osds: 110 up (since 28m), 110 in (since 28m); 1535 remapped pgs
> > >>>> >> >> >     rgw: 3 daemons active (eu-central-1, eu-msg-1, eu-secure-1)
> > >>>> >> >> >
> > >>>> >> >> >   task status:
> > >>>> >> >> >
> > >>>> >> >> >   data:
> > >>>> >> >> >     pools:   37 pools, 4032 pgs
> > >>>> >> >> >     objects: 116.23M objects, 182 TiB
> > >>>> >> >> >     usage:   589 TiB used, 206 TiB / 795 TiB avail
> > >>>> >> >> >     pgs:     160918554/348689415 objects misplaced (46.150%)
> > >>>> >> >> >              2497 active+clean
> > >>>> >> >> >              779  active+remapped+backfill_wait
> > >>>> >> >> >              538  active+remapped+backfill_wait+backfill_toofull
> > >>>> >> >> >              186  active+remapped+backfilling
> > >>>> >> >> >              32   active+remapped+backfill_toofull
> > >>>> >> >> >
> > >>>> >> >> >   io:
> > >>>> >> >> >     client:   27 MiB/s rd, 69 MiB/s wr, 497 op/s rd, 153 op/s wr
> > >>>> >> >> >     recovery: 1.5 GiB/s, 922 objects/s
> > >>>> >> >> >
> > >>>> >> >> > On Tue, Mar 16, 2021 at 9:34 AM Boris Behrens <bb(a)kervyn.de> wrote:
> > >>>> >> >> >>
> > >>>> >> >> >> Hi Dan,
> > >>>> >> >> >>
> > >>>> >> >> >> my EC profile looks very "default" to me.
> > >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile ls
> > >>>> >> >> >> default
> > >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile get default
> > >>>> >> >> >> k=2
> > >>>> >> >> >> m=1
> > >>>> >> >> >> plugin=jerasure
> > >>>> >> >> >> technique=reed_sol_van
> > >>>> >> >> >>
> > >>>> >> >> >> I don't understand the output, but the balancing got worse over night:
> > >>>> >> >> >>
> > >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pool-pg-distribution 11
> > >>>> >> >> >> Searching for PGs in pools: ['11']
> > >>>> >> >> >> Summary: 1024 PGs on 84 osds
> > >>>> >> >> >>
> > >>>> >> >> >> Num OSDs with X PGs:
> > >>>> >> >> >> 15: 8
> > >>>> >> >> >> 16: 7
> > >>>> >> >> >> 17: 6
> > >>>> >> >> >> 18: 10
> > >>>> >> >> >> 19: 1
> > >>>> >> >> >> 32: 10
> > >>>> >> >> >> 33: 4
> > >>>> >> >> >> 34: 6
> > >>>> >> >> >> 35: 8
> > >>>> >> >> >> 65: 5
> > >>>> >> >> >> 66: 5
> > >>>> >> >> >> 67: 4
> > >>>> >> >> >> 68: 10
> > >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pg-histogram --normalize --pool=11
> > >>>> >> >> >> # NumSamples = 84; Min = 4.12; Max = 5.09
> > >>>> >> >> >> # Mean = 4.553355; Variance = 0.052415; SD = 0.228942; Median 4.561608
> > >>>> >> >> >> # each ∎ represents a count of 1
> > >>>> >> >> >> 4.1244 - 4.2205 [  8]: ∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.2205 - 4.3166 [  6]: ∎∎∎∎∎∎
> > >>>> >> >> >> 4.3166 - 4.4127 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.4127 - 4.5087 [ 10]: ∎∎∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.5087 - 4.6048 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.6048 - 4.7009 [ 19]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.7009 - 4.7970 [  6]: ∎∎∎∎∎∎
> > >>>> >> >> >> 4.7970 - 4.8931 [  8]: ∎∎∎∎∎∎∎∎
> > >>>> >> >> >> 4.8931 - 4.9892 [  4]: ∎∎∎∎
> > >>>> >> >> >> 4.9892 - 5.0852 [  1]: ∎
> > >>>> >> >> >> [root@s3db1 ~]# ceph osd df tree | sort -nk 17 | tail
> > >>>> >> >> >> 14 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 724 GiB 19 GiB 0 B 724 GiB 80.56 1.07 56 up osd.14
> > >>>> >> >> >> 19 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 466 MiB 7.9 GiB 708 GiB 81.25 1.08 53 up osd.19
> > >>>> >> >> >> 4  hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 698 GiB 703 MiB 0 B 698 GiB 81.27 1.08 48 up osd.4
> > >>>> >> >> >> 24 hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 695 GiB 640 MiB 0 B 695 GiB 81.34 1.08 46 up osd.24
> > >>>> >> >> >> 75 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 440 MiB 8.1 GiB 704 GiB 81.35 1.08 48 up osd.75
> > >>>> >> >> >> 71 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 3.0 TiB 7.5 MiB 8.0 GiB 663 GiB 82.44 1.09 47 up osd.71
> > >>>> >> >> >> 76 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 251 MiB 9.0 GiB 617 GiB 83.65 1.11 50 up osd.76
> > >>>> >> >> >> 33 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 399 MiB 8.1 GiB 618 GiB 83.85 1.11 55 up osd.33
> > >>>> >> >> >> 35 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 317 MiB 8.8 GiB 617 GiB 83.87 1.11 50 up osd.35
> > >>>> >> >> >> 34 hdd 3.73630 1.00000 3.7 TiB 3.2 TiB 3.1 TiB 451 MiB 8.7 GiB 545 GiB 85.75 1.14 54 up osd.34
> > >>>> >> >> >>
> > >>>> >> >> >> On Mon, Mar 15, 2021 at 5:23 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>>
> > >>>> >> >> >>> Hi,
> > >>>> >> >> >>>
> > >>>> >> >> >>> How wide are your EC profiles? If they are really wide, you might be
> > >>>> >> >> >>> reaching the limits of what is physically possible. Also, I'm not sure
> > >>>> >> >> >>> that upmap in 14.2.11 is very smart about *improving* existing upmap
> > >>>> >> >> >>> rules for a given PG, in the case that a PG already has an upmap-items
> > >>>> >> >> >>> entry but it would help the distribution to add more mapping pairs to
> > >>>> >> >> >>> that entry. What this means, is that it might sometimes be useful to
> > >>>> >> >> >>> randomly remove some upmap entries and see if the balancer does a
> > >>>> >> >> >>> better job when it replaces them.
> > >>>> >> >> >>>
> > >>>> >> >> >>> But before you do that, I re-remembered that looking at the total PG
> > >>>> >> >> >>> numbers is not useful -- you need to check the PGs per OSD for the
> > >>>> >> >> >>> eu-central-1.rgw.buckets.data pool only.
> > >>>> >> >> >>>
> > >>>> >> >> >>> We have a couple tools that can help with this:
> > >>>> >> >> >>>
> > >>>> >> >> >>> 1. To see the PGs per OSD for a given pool:
> > >>>> >> >> >>> https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-pool-pg-dis…
> > >>>> >> >> >>>
> > >>>> >> >> >>> E.g.: ./ceph-pool-pg-distribution 11   # to see the distribution of
> > >>>> >> >> >>> your eu-central-1.rgw.buckets.data pool.
> > >>>> >> >> >>>
> > >>>> >> >> >>> The output looks like this on my well balanced clusters:
> > >>>> >> >> >>>
> > >>>> >> >> >>> # ceph-scripts/tools/ceph-pool-pg-distribution 15
> > >>>> >> >> >>> Searching for PGs in pools: ['15']
> > >>>> >> >> >>> Summary: 256 pgs on 56 osds
> > >>>> >> >> >>>
> > >>>> >> >> >>> Num OSDs with X PGs:
> > >>>> >> >> >>> 13: 16
> > >>>> >> >> >>> 14: 40
> > >>>> >> >> >>>
> > >>>> >> >> >>> You should expect a trimodal for your cluster.
> > >>>> >> >> >>>
> > >>>> >> >> >>> 2. You can also use another script from that repo to see the PGs per
> > >>>> >> >> >>> OSD normalized to crush weight:
> > >>>> >> >> >>> ceph-scripts/tools/ceph-pg-histogram --normalize --pool=15
> > >>>> >> >> >>>
> > >>>> >> >> >>> This might explain what is going wrong.
> > >>>> >> >> >>>
> > >>>> >> >> >>> Cheers, Dan
> > >>>> >> >> >>>
> > >>>> >> >> >>>
> > >>>> >> >> >>> On Mon, Mar 15, 2021 at 3:04 PM
Boris Behrens <
bb(a)kervyn.de>
> > >>>> wrote:
> > >>>> >> >> >>> >
> > >>>> >> >> >>> > Absolutly:
> > >>>> >> >> >>> > [root@s3db1 ~]# ceph osd df
tree
> > >>>> >> >> >>> > ID CLASS WEIGHT
REWEIGHT SIZE RAW USE DATA
OMAP
> > >>>> META AVAIL
%USE VAR PGS STATUS TYPE NAME
> > >>>> >> >> >>> > -1 673.54224
- 674 TiB 496 TiB 468
TiB 97
> > >>>> GiB 1.2 TiB 177 TiB
73.67 1.00 - root default
> > >>>> >> >> >>> > -2 58.30331
- 58 TiB 42 TiB 38
TiB 9.2
> > >>>> GiB 99 GiB 16 TiB
72.88 0.99 - host s3db1
> > >>>> >> >> >>> > 23 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 714
> > >>>> MiB 25 GiB 3.7 TiB
74.87 1.02 194 up osd.23
> > >>>> >> >> >>> > 69 hdd 14.55269
1.00000 15 TiB 11 TiB 11
TiB 1.6
> > >>>> GiB 40 GiB 3.4 TiB
76.32 1.04 199 up osd.69
> > >>>> >> >> >>> > 73 hdd 14.55269
1.00000 15 TiB 11 TiB 11
TiB 1.3
> > >>>> GiB 34 GiB 3.8 TiB
74.15 1.01 203 up osd.73
> > >>>> >> >> >>> > 79 hdd 3.63689
1.00000 3.6 TiB 2.4 TiB 1.3
TiB 1.8
> > >>>> GiB 0 B 1.3 TiB
65.44 0.89 47 up osd.79
> > >>>> >> >> >>> > 80 hdd 3.63689
1.00000 3.6 TiB 2.4 TiB 1.3
TiB 2.2
> > >>>> GiB 0 B 1.3 TiB
65.34 0.89 48 up osd.80
> > >>>> >> >> >>> > 81 hdd 3.63689
1.00000 3.6 TiB 2.4 TiB 1.3
TiB 1.1
> > >>>> GiB 0 B 1.3 TiB
65.38 0.89 47 up osd.81
> > >>>> >> >> >>> > 82 hdd 3.63689
1.00000 3.6 TiB 2.5 TiB 1.1
TiB 619
> > >>>> MiB 0 B 1.1 TiB
68.46 0.93 41 up osd.82
> > >>>> >> >> >>> > -11 50.94173
- 51 TiB 37 TiB 37
TiB 3.5
> > >>>> GiB 98 GiB 14 TiB
71.90 0.98 - host s3db10
> > >>>> >> >> >>> > 63 hdd 7.27739
1.00000 7.3 TiB 5.3 TiB 5.3
TiB 647
> > >>>> MiB 14 GiB 2.0 TiB
72.72 0.99 94 up osd.63
> > >>>> >> >> >>> > 64 hdd 7.27739
1.00000 7.3 TiB 5.3 TiB 5.2
TiB 668
> > >>>> MiB 14 GiB 2.0 TiB
72.23 0.98 93 up osd.64
> > >>>> >> >> >>> > 65 hdd 7.27739
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 227
> > >>>> MiB 14 GiB 2.1 TiB
71.16 0.97 100 up osd.65
> > >>>> >> >> >>> > 66 hdd 7.27739
1.00000 7.3 TiB 5.4 TiB 5.4
TiB 313
> > >>>> MiB 14 GiB 1.9 TiB
74.25 1.01 92 up osd.66
> > >>>> >> >> >>> > 67 hdd 7.27739
1.00000 7.3 TiB 5.1 TiB 5.1
TiB 584
> > >>>> MiB 14 GiB 2.1 TiB
70.63 0.96 96 up osd.67
> > >>>> >> >> >>> > 68 hdd 7.27739
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 720
> > >>>> MiB 14 GiB 2.1 TiB
71.72 0.97 101 up osd.68
> > >>>> >> >> >>> > 70 hdd 7.27739
1.00000 7.3 TiB 5.1 TiB 5.1
TiB 425
> > >>>> MiB 14 GiB 2.1 TiB
70.59 0.96 97 up osd.70
> > >>>> >> >> >>> > -12 50.99052
- 51 TiB 38 TiB 37
TiB 2.1
> > >>>> GiB 97 GiB 13 TiB
73.77 1.00 - host s3db11
> > >>>> >> >> >>> > 46 hdd 7.27739
1.00000 7.3 TiB 5.6 TiB 5.6
TiB 229
> > >>>> MiB 14 GiB 1.7 TiB
77.05 1.05 97 up osd.46
> > >>>> >> >> >>> > 47 hdd 7.27739
1.00000 7.3 TiB 5.1 TiB 5.1
TiB 159
> > >>>> MiB 13 GiB 2.2 TiB
70.00 0.95 89 up osd.47
> > >>>> >> >> >>> > 48 hdd 7.27739
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 279
> > >>>> MiB 14 GiB 2.1 TiB
71.82 0.97 98 up osd.48
> > >>>> >> >> >>> > 49 hdd 7.27739
1.00000 7.3 TiB 5.5 TiB 5.4
TiB 276
> > >>>> MiB 14 GiB 1.8 TiB
74.90 1.02 95 up osd.49
> > >>>> >> >> >>> > 50 hdd 7.27739
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 336
> > >>>> MiB 14 GiB 2.0 TiB
72.13 0.98 93 up osd.50
> > >>>> >> >> >>> > 51 hdd 7.27739
1.00000 7.3 TiB 5.7 TiB 5.6
TiB 728
> > >>>> MiB 15 GiB 1.6 TiB
77.76 1.06 98 up osd.51
> > >>>> >> >> >>> > 72 hdd 7.32619
1.00000 7.3 TiB 5.3 TiB 5.3
TiB 147
> > >>>> MiB 13 GiB 2.0 TiB
72.75 0.99 95 up osd.72
> > >>>> >> >> >>> > -37 58.55478
- 59 TiB 44 TiB 44
TiB 4.4
> > >>>> GiB 122 GiB 15 TiB
75.20 1.02 - host s3db12
> > >>>> >> >> >>> > 19 hdd 3.68750
1.00000 3.7 TiB 2.9 TiB 2.9
TiB 454
> > >>>> MiB 8.2 GiB 780 GiB
79.35 1.08 53 up osd.19
> > >>>> >> >> >>> > 71 hdd 3.68750
1.00000 3.7 TiB 3.0 TiB 2.9
TiB 7.1
> > >>>> MiB 8.0 GiB 734 GiB
80.56 1.09 47 up osd.71
> > >>>> >> >> >>> > 75 hdd 3.68750
1.00000 3.7 TiB 2.9 TiB 2.9
TiB 439
> > >>>> MiB 8.2 GiB 777 GiB
79.43 1.08 48 up osd.75
> > >>>> >> >> >>> > 76 hdd 3.68750
1.00000 3.7 TiB 3.0 TiB 3.0
TiB 241
> > >>>> MiB 8.9 GiB 688 GiB
81.77 1.11 50 up osd.76
> > >>>> >> >> >>> > 77 hdd 14.60159
1.00000 15 TiB 11 TiB 11
TiB 880
> > >>>> MiB 30 GiB 3.6 TiB
75.44 1.02 201 up osd.77
> > >>>> >> >> >>> > 78 hdd 14.60159
1.00000 15 TiB 10 TiB 10
TiB 1015
> > >>>> MiB 28 GiB 4.2 TiB
71.26 0.97 193 up osd.78
> > >>>> >> >> >>> > 83 hdd 14.60159
1.00000 15 TiB 11 TiB 11
TiB 1.4
> > >>>> GiB 30 GiB 3.8 TiB
73.76 1.00 203 up osd.83
> > >>>> >> >> >>> > -3 58.49872
- 58 TiB 42 TiB 36
TiB 8.2
> > >>>> GiB 89 GiB 17 TiB
71.71 0.97 - host s3db2
> > >>>> >> >> >>> > 1 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 3.2
> > >>>> GiB 37 GiB 3.7 TiB
74.58 1.01 196 up osd.1
> > >>>> >> >> >>> > 3 hdd 3.63689
1.00000 3.6 TiB 2.3 TiB 1.3
TiB 566
> > >>>> MiB 0 B 1.3 TiB
64.11 0.87 50 up osd.3
> > >>>> >> >> >>> > 4 hdd 3.63689
1.00000 3.6 TiB 2.9 TiB 771
GiB 695
> > >>>> MiB 0 B 771 GiB
79.30 1.08 48 up osd.4
> > >>>> >> >> >>> > 5 hdd 3.63689
1.00000 3.6 TiB 2.4 TiB 1.2
TiB 482
> > >>>> MiB 0 B 1.2 TiB
66.51 0.90 49 up osd.5
> > >>>> >> >> >>> > 6 hdd 3.63689
1.00000 3.6 TiB 2.3 TiB 1.3
TiB 1.8
> > >>>> GiB 0 B 1.3 TiB
64.00 0.87 42 up osd.6
> > >>>> >> >> >>> > 7 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 639
> > >>>> MiB 26 GiB 4.0 TiB
72.44 0.98 192 up osd.7
> > >>>> >> >> >>> > 74 hdd 14.65039
1.00000 15 TiB 10 TiB 10
TiB 907
> > >>>> MiB 26 GiB 4.2 TiB
71.32 0.97 193 up osd.74
> > >>>> >> >> >>> > -4 58.49872
- 58 TiB 43 TiB 36
TiB 34
> > >>>> GiB 85 GiB 16 TiB
72.69 0.99 - host s3db3
> > >>>> >> >> >>> > 2 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 980
> > >>>> MiB 26 GiB 3.8 TiB
74.36 1.01 203 up osd.2
> > >>>> >> >> >>> > 9 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 8.4
> > >>>> GiB 33 GiB 3.9 TiB
73.51 1.00 186 up osd.9
> > >>>> >> >> >>> > 10 hdd 14.65039
1.00000 15 TiB 10 TiB 10
TiB 650
> > >>>> MiB 26 GiB 4.2 TiB
71.64 0.97 201 up osd.10
> > >>>> >> >> >>> > 12 hdd 3.63689
1.00000 3.6 TiB 2.3 TiB 1.3
TiB 754
> > >>>> MiB 0 B 1.3 TiB
64.17 0.87 44 up osd.12
> > >>>> >> >> >>> > 13 hdd 3.63689
1.00000 3.6 TiB 2.8 TiB 813
GiB 2.4
> > >>>> GiB 0 B 813 GiB
78.17 1.06 58 up osd.13
> > >>>> >> >> >>> > 14 hdd 3.63689
1.00000 3.6 TiB 2.9 TiB 797
GiB 19
> > >>>> GiB 0 B 797 GiB
78.60 1.07 56 up osd.14
> > >>>> >> >> >>> > 15 hdd 3.63689
1.00000 3.6 TiB 2.3 TiB 1.3
TiB 2.2
> > >>>> GiB 0 B 1.3 TiB
63.96 0.87 41 up osd.15
> > >>>> >> >> >>> > -5 58.49872
- 58 TiB 43 TiB 36
TiB 6.7
> > >>>> GiB 97 GiB 15 TiB
74.04 1.01 - host s3db4
> > >>>> >> >> >>> > 11 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 940
> > >>>> MiB 26 GiB 4.0 TiB
72.49 0.98 196 up osd.11
> > >>>> >> >> >>> > 17 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 1022
> > >>>> MiB 26 GiB 3.6 TiB
75.23 1.02 204 up osd.17
> > >>>> >> >> >>> > 18 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 945
> > >>>> MiB 45 GiB 3.8 TiB
74.16 1.01 193 up osd.18
> > >>>> >> >> >>> > 20 hdd 3.63689
1.00000 3.6 TiB 2.6 TiB 1020
GiB 596
> > >>>> MiB 0 B 1020 GiB
72.62 0.99 57 up osd.20
> > >>>> >> >> >>> > 21 hdd 3.63689
1.00000 3.6 TiB 2.6 TiB 1023
GiB 1.9
> > >>>> GiB 0 B 1023 GiB
72.54 0.98 41 up osd.21
> > >>>> >> >> >>> > 22 hdd 3.63689
1.00000 3.6 TiB 2.6 TiB 1023
GiB 797
> > >>>> MiB 0 B 1023 GiB
72.54 0.98 53 up osd.22
> > >>>> >> >> >>> > 24 hdd 3.63689
1.00000 3.6 TiB 2.9 TiB 766
GiB 618
> > >>>> MiB 0 B 766 GiB
79.42 1.08 46 up osd.24
> > >>>> >> >> >>> > -6 58.89636
- 59 TiB 43 TiB 43
TiB 3.0
> > >>>> GiB 108 GiB 16 TiB
73.40 1.00 - host s3db5
> > >>>> >> >> >>> > 0 hdd 3.73630
1.00000 3.7 TiB 2.7 TiB 2.6
TiB 92
> > >>>> MiB 7.2 GiB 1.1 TiB
71.16 0.97 45 up osd.0
> > >>>> >> >> >>> > 25 hdd 3.73630
1.00000 3.7 TiB 2.7 TiB 2.6
TiB 2.4
> > >>>> MiB 7.3 GiB 1.1 TiB
71.23 0.97 41 up osd.25
> > >>>> >> >> >>> > 26 hdd 3.73630
1.00000 3.7 TiB 2.8 TiB 2.7
TiB 181
> > >>>> MiB 7.6 GiB 935 GiB
75.57 1.03 45 up osd.26
> > >>>> >> >> >>> > 27 hdd 3.73630
1.00000 3.7 TiB 2.7 TiB 2.6
TiB 5.1
> > >>>> MiB 7.0 GiB 1.1 TiB
71.20 0.97 47 up osd.27
> > >>>> >> >> >>> > 28 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 977
> > >>>> MiB 26 GiB 3.8 TiB
73.85 1.00 197 up osd.28
> > >>>> >> >> >>> > 29 hdd 14.65039
1.00000 15 TiB 11 TiB 10
TiB 872
> > >>>> MiB 26 GiB 4.1 TiB
71.98 0.98 196 up osd.29
> > >>>> >> >> >>> > 30 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 943
> > >>>> MiB 27 GiB 3.6 TiB
75.51 1.03 202 up osd.30
> > >>>> >> >> >>> > -7 58.89636
- 59 TiB 44 TiB 43
TiB 13
> > >>>> GiB 122 GiB 15 TiB
74.97 1.02 - host s3db6
> > >>>> >> >> >>> > 32 hdd 3.73630
1.00000 3.7 TiB 2.8 TiB 2.7
TiB 27
> > >>>> MiB 7.6 GiB 940 GiB
75.42 1.02 55 up osd.32
> > >>>> >> >> >>> > 33 hdd 3.73630
1.00000 3.7 TiB 3.1 TiB 3.0
TiB 376
> > >>>> MiB 8.2 GiB 691 GiB
81.94 1.11 55 up osd.33
> > >>>> >> >> >>> > 34 hdd 3.73630
1.00000 3.7 TiB 3.1 TiB 3.0
TiB 450
> > >>>> MiB 8.5 GiB 620 GiB
83.79 1.14 54 up osd.34
> > >>>> >> >> >>> > 35 hdd 3.73630
1.00000 3.7 TiB 3.1 TiB 3.0
TiB 316
> > >>>> MiB 8.4 GiB 690 GiB
81.98 1.11 50 up osd.35
> > >>>> >> >> >>> > 36 hdd 14.65039
1.00000 15 TiB 11 TiB 10
TiB 489
> > >>>> MiB 25 GiB 4.1 TiB
71.69 0.97 208 up osd.36
> > >>>> >> >> >>> > 37 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 11
> > >>>> GiB 38 GiB 4.0 TiB
72.41 0.98 195 up osd.37
> > >>>> >> >> >>> > 38 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 1.1
> > >>>> GiB 26 GiB 3.7 TiB
74.88 1.02 204 up osd.38
> > >>>> >> >> >>> > -8 58.89636
- 59 TiB 44 TiB 43
TiB 3.8
> > >>>> GiB 111 GiB 15 TiB
74.16 1.01 - host s3db7
> > >>>> >> >> >>> > 39 hdd 3.73630
1.00000 3.7 TiB 2.8 TiB 2.7
TiB 19
> > >>>> MiB 7.5 GiB 936 GiB
75.54 1.03 39 up osd.39
> > >>>> >> >> >>> > 40 hdd 3.73630
1.00000 3.7 TiB 2.6 TiB 2.5
TiB 144
> > >>>> MiB 7.1 GiB 1.1 TiB
69.87 0.95 39 up osd.40
> > >>>> >> >> >>> > 41 hdd 3.73630
1.00000 3.7 TiB 2.7 TiB 2.7
TiB 219
> > >>>> MiB 7.6 GiB 1011 GiB
73.57 1.00 55 up osd.41
> > >>>> >> >> >>> > 42 hdd 3.73630
1.00000 3.7 TiB 2.6 TiB 2.5
TiB 593
> > >>>> MiB 7.1 GiB 1.1 TiB
70.02 0.95 47 up osd.42
> > >>>> >> >> >>> > 43 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 500
> > >>>> MiB 27 GiB 3.7 TiB
74.67 1.01 204 up osd.43
> > >>>> >> >> >>> > 44 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 1.1
> > >>>> GiB 27 GiB 3.7 TiB
74.62 1.01 193 up osd.44
> > >>>> >> >> >>> > 45 hdd 14.65039
1.00000 15 TiB 11 TiB 11
TiB 1.2
> > >>>> GiB 29 GiB 3.6 TiB
75.16 1.02 204 up osd.45
> > >>>> >> >> >>> > -9 51.28331
- 51 TiB 39 TiB 39
TiB 4.9
> > >>>> GiB 107 GiB 12 TiB
76.50 1.04 - host s3db8
> > >>>> >> >> >>> > 8 hdd 7.32619
1.00000 7.3 TiB 5.6 TiB 5.5
TiB 474
> > >>>> MiB 14 GiB 1.7 TiB
76.37 1.04 98 up osd.8
> > >>>> >> >> >>> > 16 hdd 7.32619
1.00000 7.3 TiB 5.7 TiB 5.7
TiB 783
> > >>>> MiB 15 GiB 1.6 TiB
78.39 1.06 100 up osd.16
> > >>>> >> >> >>> > 31 hdd 7.32619
1.00000 7.3 TiB 5.7 TiB 5.6
TiB 441
> > >>>> MiB 14 GiB 1.6 TiB
77.70 1.05 91 up osd.31
> > >>>> >> >> >>> > 52 hdd 7.32619
1.00000 7.3 TiB 5.6 TiB 5.5
TiB 939
> > >>>> MiB 14 GiB 1.7 TiB
76.29 1.04 102 up osd.52
> > >>>> >> >> >>> > 53 hdd 7.32619
1.00000 7.3 TiB 5.4 TiB 5.4
TiB 848
> > >>>> MiB 18 GiB 1.9 TiB
74.30 1.01 98 up osd.53
> > >>>> >> >> >>> > 54 hdd 7.32619
1.00000 7.3 TiB 5.6 TiB 5.6
TiB 1.0
> > >>>> GiB 16 GiB 1.7 TiB
76.99 1.05 106 up osd.54
> > >>>> >> >> >>> > 55 hdd 7.32619
1.00000 7.3 TiB 5.5 TiB 5.5
TiB 460
> > >>>> MiB 15 GiB 1.8 TiB
75.46 1.02 105 up osd.55
> > >>>> >> >> >>> > -10 51.28331
- 51 TiB 37 TiB 37
TiB 3.8
> > >>>> GiB 96 GiB 14 TiB
72.77 0.99 - host s3db9
> > >>>> >> >> >>> > 56 hdd 7.32619
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 846
> > >>>> MiB 13 GiB 2.1 TiB
71.16 0.97 104 up osd.56
> > >>>> >> >> >>> > 57 hdd 7.32619
1.00000 7.3 TiB 5.6 TiB 5.6
TiB 513
> > >>>> MiB 15 GiB 1.7 TiB
76.53 1.04 96 up osd.57
> > >>>> >> >> >>> > 58 hdd 7.32619
1.00000 7.3 TiB 5.2 TiB 5.2
TiB 604
> > >>>> MiB 13 GiB 2.1 TiB
71.23 0.97 98 up osd.58
> > >>>> >> >> >>> > 59 hdd 7.32619
1.00000 7.3 TiB 5.1 TiB 5.1
TiB 414
> > >>>> MiB 13 GiB 2.2 TiB
70.03 0.95 88 up osd.59
> > >>>> >> >> >>> > 60 hdd 7.32619
1.00000 7.3 TiB 5.5 TiB 5.5
TiB 227
> > >>>> MiB 14 GiB 1.8 TiB
75.54 1.03 97 up osd.60
> > >>>> >> >> >>> > 61 hdd 7.32619
1.00000 7.3 TiB 5.1 TiB 5.1
TiB 456
> > >>>> MiB 13 GiB 2.2 TiB
70.01 0.95 95 up osd.61
> > >>>> >> >> >>> > 62 hdd 7.32619
1.00000 7.3 TiB 5.5 TiB 5.4
TiB 843
> > >>>> MiB 14 GiB 1.8 TiB
74.93 1.02 110 up osd.62
> > >>>> >> >> >>> >
TOTAL 674 TiB 496 TiB 468
TiB 97
> > >>>> GiB 1.2 TiB 177 TiB
73.67
> > >>>> >> >> >>> > MIN/MAX VAR: 0.87/1.14
STDDEV: 4.22
> > >>>> >> >> >>> >
> > >>>> >> >> >>> > On Mon, Mar 15, 2021 at 3:02 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>> >>
> > >>>> >> >> >>> >> OK thanks. Indeed "prepared 0/10 changes" means it thinks
> > >>>> >> >> >>> >> things are balanced.
> > >>>> >> >> >>> >> Could you again share the full ceph osd df tree?
> > >>>> >> >> >>> >>
> > >>>> >> >> >>> >> On Mon, Mar 15, 2021 at
2:54 PM Boris Behrens <
> > >>>> bb(a)kervyn.de> wrote:
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> > Hi Dan,
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> > I've set the
autoscaler to warn, but it actually
does
> > >>>> not warn for now. So
not touching it for now.
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> > this is what the
log says in minute intervals:
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/active
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/sleep_interval
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/begin_time
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/end_time
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/begin_weekday
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.970 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/end_weekday
> > >>>> >> >> >>> >> > 2021-03-15
13:51:00.971 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/pool_ids
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4
mgr[balancer]
> > >>>> Optimize plan
auto_2021-03-15_13:51:00
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/mode
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4
mgr[balancer]
> > >>>> Mode upmap, max
misplaced 0.050000
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4
mgr[balancer]
> > >>>> do_upmap
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/upmap_max_iterations
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4 mgr
get_config
> > >>>> get_config key:
mgr/balancer/upmap_max_deviation
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.203 7f307d5fd700 4
mgr[balancer]
> > >>>> pools
['eu-msg-1.rgw.data.root', 'eu-msg-1.rgw.buckets.non-ec',
> > >>>> 'eu-central-1.rgw.users.keys',
'eu-central-1.rgw.gc',
> > >>>> 'eu-central-1.rgw.buckets.data',
'eu-central-1.rgw.users.email',
> > >>>> 'eu-msg-1.rgw.gc', 'eu-central-1.rgw.usage',
'eu-msg-1.rgw.users.keys',
> > >>>>
'eu-central-1.rgw.buckets.index', 'rbd', 'eu-msg-1.rgw.log',
> > >>>> 'whitespace-again-2021-03-10_2',
'eu-msg-1.rgw.buckets.index',
> > >>>> 'eu-msg-1.rgw.meta', 'eu-central-1.rgw.log',
'default.rgw.gc',
> > >>>> 'eu-central-1.rgw.buckets.non-ec',
'eu-msg-1.rgw.usage',
> > >>>> 'whitespace-again-2021-03-10',
'fra-1.rgw.meta',
> > >>>> 'eu-central-1.rgw.users.uid',
'eu-msg-1.rgw.users.email',
> > >>>> 'fra-1.rgw.control', 'eu-msg-1.rgw.users.uid',
'eu-msg-1.rgw.control',
> > >>>> '.rgw.root',
'eu-msg-1.rgw.buckets.data', 'default.rgw.control',
> > >>>> 'fra-1.rgw.log', 'default.rgw.data.root',
'whitespace-again-2021-03-10_3',
> > >>>>
'default.rgw.log', 'eu-central-1.rgw.meta',
'eu-central-1.rgw.data.root',
> > >>>>
'default.rgw.users.uid', 'eu-central-1.rgw.control']
> > >>>> >> >> >>> >> > 2021-03-15
13:51:01.224 7f307d5fd700 4
mgr[balancer]
> > >>>> prepared 0/10 changes
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> > On Mon, Mar 15, 2021 at 2:15 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>> >> >>
> > >>>> >> >> >>> >> >> I suggest to just disable the autoscaler until your
> > >>>> >> >> >>> >> >> balancing is understood.
> > >>>> >> >> >>> >> >>
> > >>>> >> >> >>> >> >> What does your active mgr log say (with debug_mgr 4/5),
> > >>>> >> >> >>> >> >> grep balancer /var/log/ceph/ceph-mgr.*.log
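> > >>>> >> >> >>> >> >> (Roughly, and assuming the usual commands: "ceph osd pool set
> > >>>> >> >> >>> >> >> eu-central-1.rgw.buckets.data pg_autoscale_mode off" to disable the
> > >>>> >> >> >>> >> >> autoscaler for that pool, and "ceph config set mgr debug_mgr 4/5"
> > >>>> >> >> >>> >> >> before grepping the log.)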
> > >>>> >> >> >>> >> >>
> > >>>> >> >> >>> >> >> -- Dan
> > >>>> >> >> >>> >> >>
> > >>>> >> >> >>> >> >> On Mon, Mar
15, 2021 at 1:47 PM Boris Behrens <
> > >>>> bb(a)kervyn.de> wrote:
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> > Hi,
> > >>>> >> >> >>> >> >> > this unfortunately did not solve my problem. I still have some
> > >>>> >> >> >>> >> >> > OSDs that fill up to 85%
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> > According to the logging, the autoscaler might want to add more
> > >>>> >> >> >>> >> >> > PGs to one bucket and reduce almost all other buckets to 32.
> > >>>> >> >> >>> >> >> > 2021-03-15 12:19:58.825 7f307f601700 4 mgr[pg_autoscaler] Pool
> > >>>> >> >> >>> >> >> > 'eu-central-1.rgw.buckets.data' root_id -1 using 0.705080476146 of
> > >>>> >> >> >>> >> >> > space, bias 1.0, pg target 1974.22533321 quantized to 2048 (current 1024)
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> > Why the balancing does not happen is still nebulous to me.
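> > >>>> >> >> >>> >> >> > (If I read the autoscaler right, that target is roughly
> > >>>> >> >> >>> >> >> > capacity_ratio x num_osds x mon_target_pg_per_osd / replica_size,
> > >>>> >> >> >>> >> >> > i.e. 0.705 x 84 x 100 / 3 ≈ 1974, rounded up to the next power of
> > >>>> >> >> >>> >> >> > two, 2048 -- assuming the default mon_target_pg_per_osd of 100.)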
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> > On Sat, Mar 13, 2021 at 4:37 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>> >> >> >>
> > >>>> >> >> >>> >> >> >> OK
> > >>>> >> >> >>> >> >> >> Btw, you might need to fail to a new mgr... I'm not sure if the
> > >>>> >> >> >>> >> >> >> current active will read that new config.
> > >>>> >> >> >>> >> >> >>
> > >>>> >> >> >>> >> >> >> .. dan
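> > >>>> >> >> >>> >> >> >> (i.e. something like "ceph mgr fail ceph-mgr2", assuming ceph-mgr2
> > >>>> >> >> >>> >> >> >> is still the active one, so that a standby takes over and reads the
> > >>>> >> >> >>> >> >> >> new setting.)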
> > >>>> >> >> >>> >> >> >>
> > >>>> >> >> >>> >> >> >>
> > >>>> >> >> >>> >> >> >> On
Sat, Mar 13, 2021, 4:36 PM Boris Behrens <
> > >>>> bb(a)kervyn.de> wrote:
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>> Hi,
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>> ok thanks. I just changed the value and reweighted everything back to 1.
> > >>>> >> >> >>> >> >> >>> Now I let it sync over the weekend and check how it looks on Monday.
> > >>>> >> >> >>> >> >> >>> We tried to have the systems' total storage as balanced as possible. New
> > >>>> >> >> >>> >> >> >>> systems will be with 8TB disks, but for the existing ones we added 16TB
> > >>>> >> >> >>> >> >> >>> to offset the 4TB disks, and we needed a lot of storage fast because of
> > >>>> >> >> >>> >> >> >>> a DC move. If you have any recommendations I would be happy to hear them.
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>>
Cheers
> > >>>> >> >> >>> >> >> >>>
Boris
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>> On Sat, Mar 13, 2021 at 4:20 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >> >>>> Thanks.
> > >>>> >> >> >>> >> >> >>>>
> > >>>> >> >> >>> >> >> >>>> Decreasing the max deviation to 2 or 1 should help in your case.
> > >>>> >> >> >>> >> >> >>>> This option controls when the balancer stops trying to move PGs
> > >>>> >> >> >>> >> >> >>>> around -- by default it stops when the deviation from the mean is 5.
> > >>>> >> >> >>> >> >> >>>> Yes this is too large IMO -- all of our clusters have this set to 1.
> > >>>> >> >> >>> >> >> >>>>
> > >>>> >> >> >>> >> >> >>>> And given that you have some OSDs with more than 200 PGs, you
> > >>>> >> >> >>> >> >> >>>> definitely shouldn't increase the num PGs.
> > >>>> >> >> >>> >> >> >>>>
> > >>>> >> >> >>> >> >> >>>> But anyway with your mixed device sizes it might be challenging to
> > >>>> >> >> >>> >> >> >>>> make a perfectly uniform distribution. Give it a try with 1 though,
> > >>>> >> >> >>> >> >> >>>> and let us know how it goes.
> > >>>> >> >> >>> >> >> >>>>
> > >>>> >> >> >>> >> >> >>>> .. Dan
>>>>
> > >>>> >> >> >>> >> >>
>>>> .. Dan
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >>
>>>>
> > >>>> >> >> >>> >> >>
>>>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens
<
> > >>>> bb(a)kervyn.de>
wrote:
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>> Hi Dan,
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>> upmap_max_deviation is default (5) in our
> > >>>> cluster. Is 1 the recommended deviation?
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>> I added the whole ceph osd df tree, (I
need to
> > >>>> remove some OSDs and
readd them as bluestore with SSD, so 69, 73
and 82 are
> > >>>> a bit off now. I also
reweighted to try to get the %USE
mitigated).
> > >>>> >> >>
>>> >> >> >>>>>
> > >>>> >> >> >>> >> >>
>>>>> I will increase the mgr debugging to see
what is
> > >>>> the problem.
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>> [root@s3db1 ~]# ceph osd df tree
> > >>>> >> >> >>> >> >>
>>>>> ID CLASS WEIGHT REWEIGHT SIZE RAW
USE
> > >>>> DATA OMAP META
AVAIL %USE VAR PGS STATUS TYPE NAME
> > >>>> >> >> >>> >> >>
>>>>> -1 673.54224 - 659 TiB 491
TiB 464
> > >>>> TiB 96 GiB 1.2 TiB 168
TiB 74.57 1.00 - root default
> > >>>> >> >> >>> >> >>
>>>>> -2 58.30331 - 44 TiB 22
TiB 17
> > >>>> TiB 5.7 GiB 38 GiB 22
TiB 49.82 0.67 - host s3db1
> > >>>> >> >> >>> >> >>
>>>>> 23 hdd 14.65039 1.00000 15 TiB 1.8
TiB 1.7
> > >>>> TiB 156 MiB 4.4 GiB 13
TiB 12.50 0.17 101 up osd.23
> > >>>> >> >> >>> >> >>
>>>>> 69 hdd 14.55269 0 0 B 0
B
> > >>>> 0 B 0 B 0 B
0 B 0 0 11 up osd.69
> > >>>> >> >> >>> >> >>
>>>>> 73 hdd 14.55269 1.00000 15 TiB 10
TiB 10
> > >>>> TiB 6.1 MiB 33 GiB 4.2
TiB 71.15 0.95 107 up osd.73
> > >>>> >> >> >>> >> >>
>>>>> 79 hdd 3.63689 1.00000 3.6 TiB 2.9
TiB 747
> > >>>> GiB 2.0 GiB 0 B 747
GiB 79.94 1.07 52 up osd.79
> > >>>> >> >> >>> >> >>
>>>>> 80 hdd 3.63689 1.00000 3.6 TiB 2.6
TiB 1.0
> > >>>> TiB 1.9 GiB 0 B 1.0
TiB 71.61 0.96 58 up osd.80
> > >>>> >> >> >>> >> >>
>>>>> 81 hdd 3.63689 1.00000 3.6 TiB 2.2
TiB 1.5
> > >>>> TiB 1.1 GiB 0 B 1.5
TiB 60.07 0.81 55 up osd.81
> > >>>> >> >> >>> >> >>
>>>>> 82 hdd 3.63689 1.00000 3.6 TiB 1.9
TiB 1.7
> > >>>> TiB 536 MiB 0 B 1.7
TiB 52.68 0.71 30 up osd.82
> > >>>> >> >> >>> >> >>
>>>>> -11 50.94173 - 51 TiB 38
TiB 38
> > >>>> TiB 3.7 GiB 100 GiB 13
TiB 74.69 1.00 - host s3db10
> > >>>> >> >> >>> >> >>
>>>>> 63 hdd 7.27739 1.00000 7.3 TiB 5.5
TiB 5.5
> > >>>> TiB 616 MiB 14 GiB 1.7
TiB 76.04 1.02 92 up osd.63
> > >>>> >> >> >>> >> >>
>>>>> 64 hdd 7.27739 1.00000 7.3 TiB 5.5
TiB 5.5
> > >>>> TiB 820 MiB 15 GiB 1.8
TiB 75.54 1.01 101 up osd.64
> > >>>> >> >> >>> >> >>
>>>>> 65 hdd 7.27739 1.00000 7.3 TiB 5.3
TiB 5.3
> > >>>> TiB 109 MiB 14 GiB 2.0
TiB 73.17 0.98 105 up osd.65
> > >>>> >> >> >>> >> >>
>>>>> 66 hdd 7.27739 1.00000 7.3 TiB 5.8
TiB 5.8
> > >>>> TiB 423 MiB 15 GiB 1.4
TiB 80.38 1.08 98 up osd.66
> > >>>> >> >> >>> >> >>
>>>>> 67 hdd 7.27739 1.00000 7.3 TiB 5.1
TiB 5.1
> > >>>> TiB 572 MiB 14 GiB 2.2
TiB 70.10 0.94 100 up osd.67
> > >>>> >> >> >>> >> >>
>>>>> 68 hdd 7.27739 1.00000 7.3 TiB 5.3
TiB 5.3
> > >>>> TiB 630 MiB 13 GiB 2.0
TiB 72.88 0.98 107 up osd.68
> > >>>> >> >> >>> >> >>
>>>>> 70 hdd 7.27739 1.00000 7.3 TiB 5.4
TiB 5.4
> > >>>> TiB 648 MiB 14 GiB 1.8
TiB 74.73 1.00 102 up osd.70
> > >>>> >> >> >>> >> >>
>>>>> -12 50.99052 - 51 TiB 39
TiB 39
> > >>>> TiB 2.9 GiB 99 GiB 12
TiB 77.24 1.04 - host s3db11
> > >>>> >> >> >>> >> >>
>>>>> 46 hdd 7.27739 1.00000 7.3 TiB 5.7
TiB 5.7
> > >>>> TiB 102 MiB 15 GiB 1.5
TiB 78.91 1.06 97 up osd.46
> > >>>> >> >> >>> >> >>
>>>>> 47 hdd 7.27739 1.00000 7.3 TiB 5.2
TiB 5.2
> > >>>> TiB 61 MiB 13 GiB 2.1
TiB 71.47 0.96 96 up osd.47
> > >>>> >> >> >>> >> >>
>>>>> 48 hdd 7.27739 1.00000 7.3 TiB 6.1
TiB 6.1
> > >>>> TiB 853 MiB 15 GiB 1.2
TiB 83.46 1.12 109 up osd.48
> > >>>> >> >> >>> >> >>
>>>>> 49 hdd 7.27739 1.00000 7.3 TiB 5.7
TiB 5.7
> > >>>> TiB 708 MiB 15 GiB 1.5
TiB 78.96 1.06 98 up osd.49
> > >>>> >> >> >>> >> >>
>>>>> 50 hdd 7.27739 1.00000 7.3 TiB 5.9
TiB 5.8
> > >>>> TiB 472 MiB 15 GiB 1.4
TiB 80.40 1.08 102 up osd.50
> > >>>> >> >> >>> >> >>
>>>>> 51 hdd 7.27739 1.00000 7.3 TiB 5.9
TiB 5.9
> > >>>> TiB 729 MiB 15 GiB 1.3
TiB 81.70 1.10 110 up osd.51
> > >>>> >> >> >>> >> >>
>>>>> 72 hdd 7.32619 1.00000 7.3 TiB 4.8
TiB 4.8
> > >>>> TiB 91 MiB 12 GiB 2.5
TiB 65.82 0.88 89 up osd.72
> > >>>> >> >> >>> >> >>
>>>>> -37 58.55478 - 59 TiB 46
TiB 46
> > >>>> TiB 5.0 GiB 124 GiB 12
TiB 79.04 1.06 - host s3db12
> > >>>> >> >> >>> >> >>
>>>>> 19 hdd 3.68750 1.00000 3.7 TiB 3.1
TiB 3.1
> > >>>> TiB 462 MiB 8.2 GiB 559
GiB 85.18 1.14 55 up osd.19
> > >>>> >> >> >>> >> >>
>>>>> 71 hdd 3.68750 1.00000 3.7 TiB 2.9
TiB 2.8
> > >>>> TiB 3.9 MiB 7.8 GiB 825
GiB 78.14 1.05 50 up osd.71
> > >>>> >> >> >>> >> >>
>>>>> 75 hdd 3.68750 1.00000 3.7 TiB 3.1
TiB 3.1
> > >>>> TiB 576 MiB 8.3 GiB 555
GiB 85.29 1.14 57 up osd.75
> > >>>> >> >> >>> >> >>
>>>>> 76 hdd 3.68750 1.00000 3.7 TiB 3.2
TiB 3.1
> > >>>> TiB 239 MiB 9.3 GiB 501
GiB 86.73 1.16 50 up osd.76
> > >>>> >> >> >>> >> >>
>>>>> 77 hdd 14.60159 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 880 MiB 30 GiB 3.6
TiB 75.57 1.01 202 up osd.77
> > >>>> >> >> >>> >> >>
>>>>> 78 hdd 14.60159 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 1.0 GiB 30 GiB 3.4
TiB 76.65 1.03 196 up osd.78
> > >>>> >> >> >>> >> >>
>>>>> 83 hdd 14.60159 1.00000 15 TiB 12
TiB 12
> > >>>> TiB 1.8 GiB 31 GiB 2.9
TiB 80.04 1.07 223 up osd.83
> > >>>> >> >> >>> >> >>
>>>>> -3 58.49872 - 58 TiB 43
TiB 38
> > >>>> TiB 8.1 GiB 91 GiB 16
TiB 73.15 0.98 - host s3db2
> > >>>> >> >> >>> >> >>
>>>>> 1 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 3.1 GiB 38 GiB 3.6
TiB 75.52 1.01 194 up osd.1
> > >>>> >> >> >>> >> >>
>>>>> 3 hdd 3.63689 1.00000 3.6 TiB 2.2
TiB 1.4
> > >>>> TiB 418 MiB 0 B 1.4
TiB 60.94 0.82 52 up osd.3
> > >>>> >> >> >>> >> >>
>>>>> 4 hdd 3.63689 0.89999 3.6 TiB 3.2
TiB 401
> > >>>> GiB 845 MiB 0 B 401
GiB 89.23 1.20 53 up osd.4
> > >>>> >> >> >>> >> >>
>>>>> 5 hdd 3.63689 1.00000 3.6 TiB 2.3
TiB 1.3
> > >>>> TiB 437 MiB 0 B 1.3
TiB 62.88 0.84 51 up osd.5
> > >>>> >> >> >>> >> >>
>>>>> 6 hdd 3.63689 1.00000 3.6 TiB 2.0
TiB 1.7
> > >>>> TiB 1.8 GiB 0 B 1.7
TiB 54.51 0.73 47 up osd.6
> > >>>> >> >> >>> >> >>
>>>>> 7 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 493 MiB 26 GiB 3.8
TiB 73.90 0.99 185 up osd.7
> > >>>> >> >> >>> >> >>
>>>>> 74 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 1.1 GiB 27 GiB 3.5
TiB 76.27 1.02 208 up osd.74
> > >>>> >> >> >>> >> >>
>>>>> -4 58.49872 - 58 TiB 43
TiB 37
> > >>>> TiB 33 GiB 86 GiB 15
TiB 74.05 0.99 - host s3db3
> > >>>> >> >> >>> >> >>
>>>>> 2 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 850 MiB 26 GiB 4.0
TiB 72.78 0.98 203 up osd.2
> > >>>> >> >> >>> >> >>
>>>>> 9 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 8.3 GiB 33 GiB 3.6
TiB 75.62 1.01 189 up osd.9
> > >>>> >> >> >>> >> >>
>>>>> 10 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 663 MiB 28 GiB 3.5
TiB 76.34 1.02 211 up osd.10
> > >>>> >> >> >>> >> >>
>>>>> 12 hdd 3.63689 1.00000 3.6 TiB 2.4
TiB 1.2
> > >>>> TiB 633 MiB 0 B 1.2
TiB 66.22 0.89 44 up osd.12
> > >>>> >> >> >>> >> >>
>>>>> 13 hdd 3.63689 1.00000 3.6 TiB 2.9
TiB 720
> > >>>> GiB 2.3 GiB 0 B 720
GiB 80.66 1.08 66 up osd.13
> > >>>> >> >> >>> >> >>
>>>>> 14 hdd 3.63689 1.00000 3.6 TiB 3.1
TiB 552
> > >>>> GiB 18 GiB 0 B 552
GiB 85.18 1.14 60 up osd.14
> > >>>> >> >> >>> >> >>
>>>>> 15 hdd 3.63689 1.00000 3.6 TiB 2.0
TiB 1.7
> > >>>> TiB 2.1 GiB 0 B 1.7
TiB 53.72 0.72 44 up osd.15
> > >>>> >> >> >>> >> >>
>>>>> -5 58.49872 - 58 TiB 45
TiB 37
> > >>>> TiB 7.2 GiB 99 GiB 14
TiB 76.37 1.02 - host s3db4
> > >>>> >> >> >>> >> >>
>>>>> 11 hdd 14.65039 1.00000 15 TiB 12
TiB 12
> > >>>> TiB 897 MiB 28 GiB 2.8
TiB 81.15 1.09 205 up osd.11
> > >>>> >> >> >>> >> >>
>>>>> 17 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 1.2 GiB 27 GiB 3.6
TiB 75.38 1.01 211 up osd.17
> > >>>> >> >> >>> >> >>
>>>>> 18 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 965 MiB 44 GiB 4.0
TiB 72.86 0.98 188 up osd.18
> > >>>> >> >> >>> >> >>
>>>>> 20 hdd 3.63689 1.00000 3.6 TiB 2.9
TiB 796
> > >>>> GiB 529 MiB 0 B 796
GiB 78.63 1.05 66 up osd.20
> > >>>> >> >> >>> >> >>
>>>>> 21 hdd 3.63689 1.00000 3.6 TiB 2.6
TiB 1.1
> > >>>> TiB 2.1 GiB 0 B 1.1
TiB 70.32 0.94 47 up osd.21
> > >>>> >> >> >>> >> >>
>>>>> 22 hdd 3.63689 1.00000 3.6 TiB 2.9
TiB 802
> > >>>> GiB 882 MiB 0 B 802
GiB 78.47 1.05 58 up osd.22
> > >>>> >> >> >>> >> >>
>>>>> 24 hdd 3.63689 1.00000 3.6 TiB 2.8
TiB 856
> > >>>> GiB 645 MiB 0 B 856
GiB 77.01 1.03 47 up osd.24
> > >>>> >> >> >>> >> >>
>>>>> -6 58.89636 - 59 TiB 44
TiB 44
> > >>>> TiB 2.4 GiB 111 GiB 15
TiB 75.22 1.01 - host s3db5
> > >>>> >> >> >>> >> >>
>>>>> 0 hdd 3.73630 1.00000 3.7 TiB 2.4
TiB 2.3
> > >>>> TiB 70 MiB 6.6 GiB 1.3
TiB 65.00 0.87 48 up osd.0
> > >>>> >> >> >>> >> >>
>>>>> 25 hdd 3.73630 1.00000 3.7 TiB 2.4
TiB 2.3
> > >>>> TiB 5.3 MiB 6.6 GiB 1.4
TiB 63.86 0.86 41 up osd.25
> > >>>> >> >> >>> >> >>
>>>>> 26 hdd 3.73630 1.00000 3.7 TiB 2.9
TiB 2.8
> > >>>> TiB 181 MiB 7.6 GiB 862
GiB 77.47 1.04 48 up osd.26
> > >>>> >> >> >>> >> >>
>>>>> 27 hdd 3.73630 1.00000 3.7 TiB 2.3
TiB 2.2
> > >>>> TiB 7.0 MiB 6.1 GiB 1.5
TiB 61.00 0.82 48 up osd.27
> > >>>> >> >> >>> >> >>
>>>>> 28 hdd 14.65039 1.00000 15 TiB 12
TiB 12
> > >>>> TiB 937 MiB 30 GiB 2.8
TiB 81.19 1.09 203 up osd.28
> > >>>> >> >> >>> >> >>
>>>>> 29 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 536 MiB 26 GiB 3.8
TiB 73.95 0.99 200 up osd.29
> > >>>> >> >> >>> >> >>
>>>>> 30 hdd 14.65039 1.00000 15 TiB 12
TiB 11
> > >>>> TiB 744 MiB 28 GiB 3.1
TiB 79.07 1.06 207 up osd.30
> > >>>> >> >> >>> >> >>
>>>>> -7 58.89636 - 59 TiB 44
TiB 44
> > >>>> TiB 14 GiB 122 GiB 14
TiB 75.41 1.01 - host s3db6
> > >>>> >> >> >>> >> >>
>>>>> 32 hdd 3.73630 1.00000 3.7 TiB 3.1
TiB 3.0
> > >>>> TiB 16 MiB 8.2 GiB 622
GiB 83.74 1.12 65 up osd.32
> > >>>> >> >> >>> >> >>
>>>>> 33 hdd 3.73630 0.79999 3.7 TiB 3.0
TiB 2.9
> > >>>> TiB 14 MiB 8.1 GiB 740
GiB 80.67 1.08 52 up osd.33
> > >>>> >> >> >>> >> >>
>>>>> 34 hdd 3.73630 0.79999 3.7 TiB 2.9
TiB 2.8
> > >>>> TiB 449 MiB 7.7 GiB 877
GiB 77.08 1.03 52 up osd.34
> > >>>> >> >> >>> >> >>
>>>>> 35 hdd 3.73630 0.79999 3.7 TiB 2.3
TiB 2.2
> > >>>> TiB 133 MiB 7.0 GiB 1.4
TiB 62.18 0.83 42 up osd.35
> > >>>> >> >> >>> >> >>
>>>>> 36 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 544 MiB 26 GiB 4.0
TiB 72.98 0.98 220 up osd.36
> > >>>> >> >> >>> >> >>
>>>>> 37 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 11 GiB 38 GiB 3.6
TiB 75.30 1.01 200 up osd.37
> > >>>> >> >> >>> >> >>
>>>>> 38 hdd 14.65039 1.00000 15 TiB 11
TiB 11
> > >>>> TiB 1.2 GiB 28 GiB 3.3
TiB 77.43 1.04 217 up osd.38
> > >>>> >> >> >>> >> >>
>>>>> -8 58.89636 - 59 TiB 47
TiB 46
> > >>>> TiB 3.9 GiB 116 GiB 12
TiB 78.98 1.06 - host s3db7
> > >>>> >> >> >>> >> >>
>>>>> 39 hdd 3.73630 1.00000 3.7 TiB 3.2
TiB 3.2
> > >>>> TiB 19 MiB 8.5 GiB 499
GiB 86.96 1.17 43 up osd.39
> > >>>> >> >> >>> >> >>
>>>>> 40 hdd 3.73630 1.00000 3.7 TiB 2.6
TiB 2.5
> > >>>> TiB 144 MiB 7.0 GiB 1.2
TiB 68.33 0.92 39 up osd.40
> > >>>> >> >> >>> >> >>
>>>>> 41 hdd 3.73630 1.00000 3.7 TiB 3.0
TiB 2.9
> > >>>> TiB 218 MiB 7.9 GiB 732
GiB 80.86 1.08 64 up osd.41
> > >>>> >> >> >>> >> >>
>>>>> 42 hdd 3.73630 1.00000 3.7 TiB 2.5
TiB 2.4
> > >>>> TiB 594 MiB 7.0 GiB 1.2
TiB 67.97 0.91 50 up osd.42
> > >>>> >> >> >>> >> >>
>>>>> 43 hdd 14.65039 1.00000 15 TiB 12
TiB 12
> > >>>> TiB 564 MiB 28 GiB 2.9
TiB 80.32 1.08 213 up osd.43
> > >>>> >> >> >>> >> >>
>>>>> 44 hdd 14.65039 1.00000 15 TiB 12
TiB 11
> > >>>> TiB 1.3 GiB 28 GiB 3.1
TiB 78.59 1.05 198 up osd.44
> > >>>> >> >> >>> >> >>
>>>>> 45 hdd 14.65039 1.00000 15 TiB 12
TiB 12
> > >>>> TiB 1.2 GiB 30 GiB 2.8
TiB 81.05 1.09 214 up osd.45
> > >>>> >> >> >>> >> >>
>>>>> -9 51.28331 - 51 TiB 41
TiB 41
> > >>>> TiB 4.9 GiB 108 GiB 10
TiB 79.75 1.07 - host s3db8
> > >>>> >> >> >>> >> >>
>>>>> 8 hdd 7.32619 1.00000 7.3 TiB 5.8
TiB 5.8
> > >>>> TiB 472 MiB 15 GiB 1.5
TiB 79.68 1.07 99 up osd.8
> > >>>> >> >> >>> >> >>
>>>>> 16 hdd 7.32619 1.00000 7.3 TiB 5.9
TiB 5.8
> > >>>> TiB 785 MiB 15 GiB 1.4
TiB 80.25 1.08 97 up osd.16
> > >>>> >> >> >>> >> >>
>>>>> 31 hdd 7.32619 1.00000 7.3 TiB 5.5
TiB 5.5
> > >>>> TiB 438 MiB 14 GiB 1.8
TiB 75.36 1.01 87 up osd.31
> > >>>> >> >> >>> >> >>
>>>>> 52 hdd 7.32619 1.00000 7.3 TiB 5.7
TiB 5.7
> > >>>> TiB 844 MiB 15 GiB 1.6
TiB 78.19 1.05 113 up osd.52
> > >>>> >> >> >>> >> >>
>>>>> 53 hdd 7.32619 1.00000 7.3 TiB 6.2
TiB 6.1
> > >>>> TiB 792 MiB 18 GiB 1.1
TiB 84.46 1.13 109 up osd.53
> > >>>> >> >> >>> >> >>
>>>>> 54 hdd 7.32619 1.00000 7.3 TiB 5.6
TiB 5.6
> > >>>> TiB 959 MiB 15 GiB 1.7
TiB 76.73 1.03 115 up osd.54
> > >>>> >> >> >>> >> >>
>>>>> 55 hdd 7.32619 1.00000 7.3 TiB 6.1
TiB 6.1
> > >>>> TiB 699 MiB 16 GiB 1.2
TiB 83.56 1.12 122 up osd.55
> > >>>> >> >> >>> >> >>
>>>>> -10 51.28331 - 51 TiB 39
TiB 39
> > >>>> TiB 4.7 GiB 100 GiB 12
TiB 76.05 1.02 - host s3db9
> > >>>> >> >> >>> >> >>
>>>>> 56 hdd 7.32619 1.00000 7.3 TiB 5.2
TiB 5.2
> > >>>> TiB 840 MiB 13 GiB 2.1
TiB 71.06 0.95 105 up osd.56
> > >>>> >> >> >>> >> >>
>>>>> 57 hdd 7.32619 1.00000 7.3 TiB 6.1
TiB 6.0
> > >>>> TiB 1.0 GiB 16 GiB 1.2
TiB 83.17 1.12 102 up osd.57
> > >>>> >> >> >>> >> >>
>>>>> 58 hdd 7.32619 1.00000 7.3 TiB 6.0
TiB 5.9
> > >>>> TiB 43 MiB 15 GiB 1.4
TiB 81.56 1.09 105 up osd.58
> > >>>> >> >> >>> >> >>
>>>>> 59 hdd 7.32619 1.00000 7.3 TiB 5.9
TiB 5.9
> > >>>> TiB 429 MiB 15 GiB 1.4
TiB 80.64 1.08 94 up osd.59
> > >>>> >> >> >>> >> >>
>>>>> 60 hdd 7.32619 1.00000 7.3 TiB 5.4
TiB 5.3
> > >>>> TiB 226 MiB 14 GiB 2.0
TiB 73.25 0.98 101 up osd.60
> > >>>> >> >> >>> >> >>
>>>>> 61 hdd 7.32619 1.00000 7.3 TiB 4.8
TiB 4.8
> > >>>> TiB 1.1 GiB 12 GiB 2.5
TiB 65.84 0.88 103 up osd.61
> > >>>> >> >> >>> >> >>
>>>>> 62 hdd 7.32619 1.00000 7.3 TiB 5.6
TiB 5.6
> > >>>> TiB 1.0 GiB 15 GiB 1.7
TiB 76.83 1.03 126 up osd.62
> > >>>> >> >> >>> >> >>
>>>>> TOTAL 674 TiB 501
TiB 473
> > >>>> TiB 96 GiB 1.2 TiB 173
TiB 74.57
> > >>>> >> >> >>> >> >>
>>>>> MIN/MAX VAR: 0.17/1.20 STDDEV: 10.25
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
> > >>>> >> >> >>> >> >> >>>>> On Sat, Mar 13, 2021 at 3:57 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> No, increasing num PGs won't help
substantially.
> > >>>> >> >>
>>> >> >> >>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> Can you share the entire output of ceph
osd df
> > >>>> tree ?
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> Did you already set
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> ceph config set mgr
> > >>>> mgr/balancer/upmap_max_deviation 1
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> ??
> > >>>> >> >> >>> >> >>
>>>>>> And I recommend debug_mgr 4/5 so you can
see
> > >>>> some basic upmap
balancer logging.
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> .. Dan
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>> On Sat, Mar 13, 2021, 3:49 PM Boris
Behrens <
> > >>>> bb(a)kervyn.de>
wrote:
> > >>>> >> >> >>> >> >>
>>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>> Hello people,
> > >>>> >> >> >>> >> >>
>>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>> I am still struggeling with the balancer
> > >>>> >> >> >>> >> >>
>>>>>>> (
> > >>>>
https://www.mail-archive.com/ceph-users@ceph.io/msg09124.html)
> > >>>> >> >> >>> >> >>
>>>>>>> Now I've read some more and might think
that I
> > >>>> do not have enough
PGs.
> > >>>> >> >> >>> >> >>
>>>>>>> Currently I have 84OSDs and 1024PGs for
the
> > >>>> main pool (3008 total).
I
> > >>>> >> >> >>> >> >>
>>>>>>> have the autoscaler enabled, but I
doesn't tell
> > >>>> me to increase the
> > >>>> >> >> >>> >> >>
>>>>>>> PGs.
> > >>>> >> >> >>> >> >>
>>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>> What do you think?
> > >>>> >> >> >>> >> >>
>>>>>>>
> > >>>> >> >> >>> >> >>
>>>>>>> --
> > >>>> >> >> >>> >> >>
>>>>>>> Die Selbsthilfegruppe "UTF-8-Probleme"
trifft
> > >>>> sich diesmal
abweichend
> > >>>> >> >> >>> >> >>
>>>>>>> im groüen Saal.
> > >>>> >> >> >>> >> >>
>>>>>>>
_______________________________________________
> > >>>> >> >>
>>> >> >> >>>>>>> ceph-users mailing list --
ceph-users(a)ceph.io
> > >>>> >> >>
>>> >> >> >>>>>>> To unsubscribe send an email
to
> > >>>> ceph-users-leave(a)ceph.io
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>>
> > >>>> >> >> >>> >> >>
>>>>> --
> > >>>> >> >> >>> >> >>
>>>>> Die Selbsthilfegruppe "UTF-8-Probleme"
trifft
> > >>>> sich diesmal abweichend
im groüen Saal.
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>>
> > >>>> >> >> >>> >> >> >>>
--
> > >>>> >> >> >>> >> >> >>>
Die Selbsthilfegruppe "UTF-8-Probleme"
trifft sich
> > >>>> diesmal abweichend im
groüen Saal.
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> >
> > >>>> >> >> >>> >> >> > --
> > >>>> >> >> >>> >> >> > Die
Selbsthilfegruppe "UTF-8-Probleme" trifft
sich
> > >>>> diesmal abweichend im
groüen Saal.
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> >
> > >>>> >> >> >>> >> > --
> > >>>> >> >> >>> >> > Die
Selbsthilfegruppe "UTF-8-Probleme" trifft sich
> > >>>> diesmal abweichend im groüen Saal.
> > >>>> >> >> >>> >
> > >>>> >> >> >>> >
> > >>>> >> >> >>> >
> > >>>> >> >> >>> > --
> > >>>> >> >> >>> > Die Selbsthilfegruppe
"UTF-8-Probleme" trifft sich
diesmal
> > >>>> abweichend im groüen
Saal.
> > >>>> >> >> >>
> > >>>> >> >> >>
> > >>>> >> >> >>
> > >>>> >> >> >> --
> > >>>> >> >> >> Die Selbsthilfegruppe
"UTF-8-Probleme" trifft sich
diesmal
> > >>>> abweichend im groüen
Saal.
> > >>>> >> >> >
> > >>>> >> >> >
> > >>>> >> >> >
> > >>>> >> >> > --
> > >>>> >> >> > Die Selbsthilfegruppe
"UTF-8-Probleme" trifft sich
diesmal
> > >>>> abweichend im groüen
Saal.
> > >>>> >> >
> > >>>> >> >
> > >>>> >> >
> > >>>> >> > --
> > >>>> >> > Die Selbsthilfegruppe "UTF-8-Probleme"
trifft sich diesmal
> > >>>> abweichend im groüen Saal.
> > >>>> >
> > >>>> >
> > >>>> >
> > >>>> > --
> > >>>> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft
sich diesmal
abweichend
> > >>>> im groüen Saal.
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich
diesmal
abweichend im
> > >>> groüen Saal.
> > >>>
> > >>
> > >
> > > --
> > > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal
abweichend im
> > > groüen Saal.
> > >
> >
> >
> > --
> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend
im
>
groüen Saal.
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
groüen Saal.