Well, sure, the backfill will continue, but will it actually let me change
the pgp_num as more space frees up? Because the issue is that I cannot
modify that value.
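(For anyone skimming the thread later: the "cannot modify" symptom shows up as pgp_num lagging behind a set pgp_num_target in `ceph osd pool ls detail`. A minimal sketch of pulling both values out of a dump line, using an abridged version of the pool 40 line quoted further down; the pool name is a stand-in:)

```shell
# Abridged 'ceph osd pool ls detail' line from this thread (pool 40);
# the pool name is a placeholder, not the real redacted name.
line="pool 40 'x.rgw.buckets.data' erasure size 9 min_size 7 crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024 pgp_num_target 2048"

# pgp_num is what is currently in effect; pgp_num_target is what was
# requested. The 'pgp_num ' pattern (trailing space) cannot match
# 'pgp_num_target', so the two greps stay distinct.
pgp_num=$(echo "$line" | grep -o 'pgp_num [0-9]*' | awk '{print $2}')
pgp_target=$(echo "$line" | grep -o 'pgp_num_target [0-9]*' | awk '{print $2}')

echo "pgp_num=$pgp_num pgp_num_target=$pgp_target"
# While these differ, the mgr still owes the pool a gradual pgp_num increase.
```

If the two numbers differ, the change was accepted but is being applied gradually rather than rejected outright.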
Thanks,
Mac Wynkoop, Senior Datacenter Engineer
*NetDepot.com:* Cloud Servers; Delivered
Houston | Atlanta | NYC | Colorado Springs
1-844-25-CLOUD Ext 806
On Wed, Oct 7, 2020 at 1:50 PM Eugen Block <eblock(a)nde.ag> wrote:
Yes, I think that’s exactly the reason. As soon as the cluster has more
space the backfill will continue.
Quoting Mac Wynkoop <mwynkoop(a)netdepot.com>:
The cluster is currently in a warn state, here's the scrubbed output of
ceph -s:
  cluster:
    id:     *redacted*
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            22 nearfull osd(s)
            2 pool(s) nearfull
            Low space hindering backfill (add storage if this doesn't
            resolve itself): 277 pgs backfill_toofull
            Degraded data redundancy: 32652738/3651947772 objects degraded
            (0.894%), 281 pgs degraded, 341 pgs undersized
            1214 pgs not deep-scrubbed in time
            2647 pgs not scrubbed in time
            2 daemons have recently crashed

  services:
    mon:         5 daemons, *redacted* (age 44h)
    mgr:         *redacted*
    osd:         162 osds: 162 up (since 44h), 162 in (since 4d);
                 971 remapped pgs
                 flags noscrub,nodeep-scrub
    rgw:         3 daemons active *redacted*
    tcmu-runner: 18 daemons active *redacted*

  data:
    pools:   10 pools, 2648 pgs
    objects: 409.56M objects, 738 TiB
    usage:   1.3 PiB used, 580 TiB / 1.8 PiB avail
    pgs:     32652738/3651947772 objects degraded (0.894%)
             517370913/3651947772 objects misplaced (14.167%)
             1677 active+clean
             477  active+remapped+backfill_wait
             100  active+remapped+backfill_wait+backfill_toofull
             80   active+undersized+degraded+remapped+backfill_wait
             60   active+undersized+degraded+remapped+backfill_wait+backfill_toofull
             42   active+undersized+degraded+remapped+backfill_toofull
             33   active+undersized+degraded+remapped+backfilling
             25   active+remapped+backfilling
             25   active+remapped+backfill_toofull
             24   active+undersized+remapped+backfilling
             23   active+forced_recovery+undersized+degraded+remapped+backfill_wait
             19   active+forced_recovery+undersized+degraded+remapped+backfill_wait+backfill_toofull
             15   active+undersized+remapped+backfill_wait
             14   active+undersized+remapped+backfill_wait+backfill_toofull
             12   active+forced_recovery+undersized+degraded+remapped+backfill_toofull
             12   active+forced_recovery+undersized+degraded+remapped+backfilling
             5    active+undersized+remapped+backfill_toofull
             3    active+remapped
             1    active+undersized+remapped
             1    active+forced_recovery+undersized+remapped+backfilling

  io:
    client:   287 MiB/s rd, 40 MiB/s wr, 1.94k op/s rd, 165 op/s wr
    recovery: 425 MiB/s, 225 objects/s
Now as you can see, we do have a lot of backfill operations going on at the
moment. Does that actually prevent Ceph from modifying the pgp_num value of
a pool?
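One detail worth noting here (stated as an assumption about recent Ceph releases, not something confirmed in this thread): since Nautilus the mgr applies pgp_num changes gradually, and it defers further splitting while the cluster's misplaced-object ratio exceeds `target_max_misplaced_ratio` (default 0.05). Plugging in the numbers from the ceph -s output above:

```shell
# Misplaced-object counts taken from the 'ceph -s' output above
misplaced=517370913
total=3651947772

# Assumed default of mgr target_max_misplaced_ratio in recent releases
max_ratio=0.05

ratio=$(awk -v m="$misplaced" -v t="$total" 'BEGIN { printf "%.5f", m/t }')
echo "misplaced ratio: $ratio"   # 0.14167, the 14.167% shown by ceph -s

# While the ratio stays above the throttle, the mgr holds pgp_num back
if awk -v r="$ratio" -v x="$max_ratio" 'BEGIN { exit !(r > x) }'; then
    echo "pgp_num increase deferred until backfill catches up"
fi
```

If that is indeed the gating mechanism on this cluster, `ceph config get mgr target_max_misplaced_ratio` should show the threshold in effect.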
Thanks,
Mac Wynkoop
On Wed, Oct 7, 2020 at 8:57 AM Eugen Block <eblock(a)nde.ag> wrote:
> What is the current cluster status, is it healthy? Maybe increasing
> pg_num would hit the limit of mon_max_pg_per_osd? Can you share 'ceph
> -s' output?
>
>
> Quoting Mac Wynkoop <mwynkoop(a)netdepot.com>:
>
> > Right, both Norman and I set the pg_num before the pgp_num. For
> > example, here are my current pool settings:
> >
> > "pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7
> > crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024
> > pgp_num_target 2048 last_change 8458830 lfor 0/0/8445757 flags
> > hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576
> > fast_read 1 application rgw"
> >
> > So, when I set:
> >
> > "ceph osd pool set hou-ec-1.rgw.buckets.data pgp_num 2048"
> >
> > it returns:
> >
> > "set pool 40 pgp_num to 2048"
> >
> > But upon checking the pool details again:
> >
> > "pool 40 '*redacted*.rgw.buckets.data' erasure size 9 min_size 7
> > crush_rule 2 object_hash rjenkins pg_num 2048 pgp_num 1024
> > pgp_num_target 2048 last_change 8458870 lfor 0/0/8445757 flags
> > hashpspool,ec_overwrites,nodelete,backfillfull stripe_width 24576
> > fast_read 1 application rgw"
> >
> > the pgp_num value has not increased. Am I just doing something
> > totally wrong?
> >
> > Thanks,
> > Mac Wynkoop
> >
> >
> >
> >
> > On Tue, Oct 6, 2020 at 2:32 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
> >
> >> pg_num and pgp_num need to be the same, not?
> >>
> >> 3.5.1. Set the Number of PGs
> >>
> >> To set the number of placement groups in a pool, you must specify the
> >> number of placement groups at the time you create the pool. See Create
> >> a Pool for details. Once you set placement groups for a pool, you can
> >> increase the number of placement groups (but you cannot decrease the
> >> number of placement groups). To increase the number of placement
> >> groups, execute the following:
> >>
> >> ceph osd pool set {pool-name} pg_num {pg_num}
> >>
> >> Once you increase the number of placement groups, you must also
> >> increase the number of placement groups for placement (pgp_num) before
> >> your cluster will rebalance. The pgp_num should be equal to the
> >> pg_num. To increase the number of placement groups for placement,
> >> execute the following:
> >>
> >> ceph osd pool set {pool-name} pgp_num {pgp_num}
> >>
> >> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/s…
> >>
> >> -----Original Message-----
> >> To: norman
> >> Cc: ceph-users
> >> Subject: [ceph-users] Re: pool pgp_num not updated
> >>
> >> Hi everyone,
> >>
> >> I'm seeing a similar issue here. Any ideas on this?
> >> Mac Wynkoop,
> >>
> >>
> >>
> >> On Sun, Sep 6, 2020 at 11:09 PM norman <norman.kern(a)gmx.com> wrote:
> >>
> >> > Hi guys,
> >> >
> >> > When I updated the pg_num of a pool, I found it didn't work (no
> >> > rebalancing happened). Does anyone know the reason? Pool's info:
> >> >
> >> > pool 21 'openstack-volumes-rs' replicated size 3 min_size 2
> >> > crush_rule 21 object_hash rjenkins pg_num 1024 pgp_num 512
> >> > pgp_num_target 1024 autoscale_mode warn last_change 85103 lfor
> >> > 82044/82044/82044 flags hashpspool,nodelete,selfmanaged_snaps
> >> > stripe_width 0 application rbd removed_snaps
> >> > [1~1e6,1e8~300,4e9~18,502~3f,542~11,554~1a,56f~1d7]
> >> > pool 22 'openstack-vms-rs' replicated size 3 min_size 2 crush_rule
> >> > 22 object_hash rjenkins pg_num 512 pgp_num 512 pg_num_target 256
> >> > pgp_num_target 256 autoscale_mode warn last_change 84769 lfor
> >> > 0/0/55294 flags hashpspool,nodelete,selfmanaged_snaps stripe_width
> >> > 0 application rbd
> >> >
> >> > The pgp_num_target is set, but pgp_num is not.
> >> >
> >> > I had scaled out new OSDs and backfilling was in progress before
> >> > setting the value; could that be the reason?
> >> > _______________________________________________
> >> > ceph-users mailing list -- ceph-users(a)ceph.io
> >> > To unsubscribe send an email to ceph-users-leave(a)ceph.io
> >> >