Those OSDs are intentionally out, yes (they were drained to be replaced).
I have already fixed two clusters' stats with this method ... both had
up-but-out OSDs, and stopping the up/out OSDs fixed the stats.
I opened a tracker for this:
https://tracker.ceph.com/issues/48385
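For the record, a minimal sketch of the workaround (not an official procedure): OSD ids 100 and 177 are the ones from this cluster, the systemctl unit names assume systemd-managed OSDs, and the awk field positions are my assumption about the usual "ceph osd tree" column layout.

```shell
# Print tree lines for OSDs that are up but out. This assumes the usual
# "ceph osd tree" layout, where STATUS, REWEIGHT and PRI-AFF are the last
# three columns, so an up/out OSD shows STATUS "up" and REWEIGHT 0.
find_up_out_osds() {
  awk '$(NF - 2) == "up" && $(NF - 1) == 0'
}

# Against a live cluster (assumes systemd-managed OSD daemons; ids 100
# and 177 are the ones from this thread -- adjust to your cluster):
#   ceph osd tree | find_up_out_osds
#   systemctl stop ceph-osd@100 ceph-osd@177
#   ceph df   # pool USED should drop back to ~replicas * STORED
```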
-- dan
On Thu, Nov 26, 2020 at 8:14 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>
> Also wondering if you have the same "gap" OSDs at different cluster(s)
> which show stats improperly?
>
>
> On 11/26/2020 10:08 PM, Dan van der Ster wrote:
> > Hey that's it!
> >
> > I stopped the up but out OSDs (100 and 177), and now the stats are correct!
> >
> > # ceph df
> > RAW STORAGE:
> >     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >
> > POOLS:
> >     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >     public     68     2.9 PiB     143.56M     4.3 PiB     84.55       538 TiB
> >     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
> >     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
> >
> >
> >
> > On Thu, Nov 26, 2020 at 8:02 PM Dan van der Ster <dan(a)vanderster.com> wrote:
> >> There are a couple gaps, yes: https://termbin.com/9mx1
> >>
> >> What should I do?
> >>
> >> -- dan
> >>
> >> On Thu, Nov 26, 2020 at 7:52 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
> >>> Does "ceph osd df tree" show stats properly (I mean there are no
> >>> evident gaps like unexpected zero values) for all the daemons?
> >>>
> >>>
> >>>> 1. Anyway, I found something weird...
> >>>>
> >>>> I created a new 1-PG pool "foo" on a different cluster and wrote
> >>>> some data to it.
> >>>>
> >>>> The stored and used are equal.
> >>>>
> >>>> Thu 26 Nov 19:26:58 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.31
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.31
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
> >>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
> >>>>
> >>>> But I tried restarting the relevant three OSDs, and the bytes_used
> >>>> are temporarily reported correctly:
> >>>>
> >>>> Thu 26 Nov 19:27:00 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.62
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     4.3 PiB     84.55       538 TiB
> >>>>     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
> >>>>
> >>>> But then a few seconds later it's back to used == stored:
> >>>>
> >>>> Thu 26 Nov 19:27:03 CET 2020
> >>>> RAW STORAGE:
> >>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
> >>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.47
> >>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.47
> >>>>
> >>>> POOLS:
> >>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
> >>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
> >>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
> >>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
> >>>>
> >>>> It seems to report the correct stats only when the PG is peering (or
> >>>> in some other transitional state).
> >>>> I've restarted all three relevant OSDs now -- the stats are reported
> >>>> as stored == used.
> >>>>
> >>>> 2. Another data point -- I found another old cluster that reports
> >>>> stored/used correctly. I have no idea what might be different about
> >>>> that cluster -- we updated it just like the others.
> >>>>
> >>>> Cheers, Dan
> >>>>
> >>>> On Thu, Nov 26, 2020 at 6:22 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
> >>>>> For a specific BlueStore instance you can learn the relevant statfs
> >>>>> output by setting debug_bluestore to 20 and leaving the OSD for 5-10
> >>>>> seconds (or maybe a couple of minutes - I don't remember the exact
> >>>>> statfs poll period).
> >>>>>
> >>>>> Then grep the osd log for "statfs" and/or "pool_statfs" and get the
> >>>>> output formatted as per the following operator (taken from
> >>>>> src/osd/osd_types.cc):
> >>>>>
> >>>>> ostream& operator<<(ostream& out, const store_statfs_t &s)
> >>>>> {
> >>>>>   out << std::hex
> >>>>>       << "store_statfs(0x" << s.available
> >>>>>       << "/0x" << s.internally_reserved
> >>>>>       << "/0x" << s.total
> >>>>>       << ", data 0x" << s.data_stored
> >>>>>       << "/0x" << s.allocated
> >>>>>       << ", compress 0x" << s.data_compressed
> >>>>>       << "/0x" << s.data_compressed_allocated
> >>>>>       << "/0x" << s.data_compressed_original
> >>>>>       << ", omap 0x" << s.omap_allocated
> >>>>>       << ", meta 0x" << s.internal_metadata
> >>>>>       << std::dec
> >>>>>       << ")";
> >>>>>   return out;
> >>>>> }
> >>>>>
> >>>>> But honestly I doubt it is BlueStore which reports incorrectly,
> >>>>> since it doesn't care about replication.
> >>>>>
> >>>>> It rather looks like a lack of stats from some replicas, or improper
> >>>>> pg replica factor processing...
> >>>>>
> >>>>> Perhaps it's legacy vs. new pools that matters... Can you try to
> >>>>> create a new pool on the old cluster, fill it with some data (e.g.
> >>>>> just a single 64K object) and check the stats?
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Igor
> >>>>>
> >>>>> On 11/26/2020 8:00 PM, Dan van der Ster wrote:
> >>>>>> Hi Igor,
> >>>>>>
> >>>>>> No BLUESTORE_LEGACY_STATFS warning, and
> >>>>>> bluestore_warn_on_legacy_statfs is the default true on this (and
> >>>>>> every other) cluster.
> >>>>>> I'm quite sure we did the statfs conversion during one of the
> >>>>>> recent upgrades (I forget which one exactly).
> >>>>>>
> >>>>>> # ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
> >>>>>> #
> >>>>>>
> >>>>>> Is there a command to see the statfs reported by an individual OSD?
> >>>>>> We have a mix of ~year-old and recently recreated OSDs, so I could
> >>>>>> try to see if they differ.
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Dan
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
> >>>>>>> Hi Dan
> >>>>>>>
> >>>>>>> don't you have the BLUESTORE_LEGACY_STATFS alert raised (it might
> >>>>>>> be silenced by the bluestore_warn_on_legacy_statfs param) for the
> >>>>>>> older cluster?
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Igor
> >>>>>>>
> >>>>>>>
> >>>>>>> On 11/26/2020 7:29 PM, Dan van der Ster wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Depending on which cluster I look at (all running v14.2.11),
> >>>>>>>> bytes_used variably reports either raw space or stored bytes.
> >>>>>>>>
> >>>>>>>> Here's a 7 year old cluster:
> >>>>>>>>
> >>>>>>>> # ceph df -f json | jq .pools[0]
> >>>>>>>> {
> >>>>>>>> "name": "volumes",
> >>>>>>>> "id": 4,
> >>>>>>>> "stats": {
> >>>>>>>> "stored": 1229308190855881,
> >>>>>>>> "objects": 294401604,
> >>>>>>>> "kb_used": 1200496280133,
> >>>>>>>> "bytes_used": 1229308190855881,
> >>>>>>>> "percent_used": 0.4401889145374298,
> >>>>>>>> "max_avail": 521125025021952
> >>>>>>>> }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> Note that stored == bytes_used for that pool. (This is a 3x
> >>>>>>>> replica pool.)
> >>>>>>>>
> >>>>>>>> But here's a newer cluster (installed recently with nautilus):
> >>>>>>>>
> >>>>>>>> # ceph df -f json | jq .pools[0]
> >>>>>>>> {
> >>>>>>>> "name": "volumes",
> >>>>>>>> "id": 1,
> >>>>>>>> "stats": {
> >>>>>>>> "stored": 680977600893041,
> >>>>>>>> "objects": 163155803,
> >>>>>>>> "kb_used": 1995736271829,
> >>>>>>>> "bytes_used": 2043633942351985,
> >>>>>>>> "percent_used": 0.23379847407341003,
> >>>>>>>> "max_avail": 2232457428467712
> >>>>>>>> }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> In the second cluster, bytes_used is 3x stored.
> >>>>>>>>
> >>>>>>>> Does anyone know why these are not reported consistently?
> >>>>>>>> Noticing this just now, I'll update our monitoring to plot
> >>>>>>>> stored rather than bytes_used from now on.
> >>>>>>>>
> >>>>>>>> Thanks!
> >>>>>>>>
> >>>>>>>> Dan
> >>>>>>>> _______________________________________________
> >>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
> >>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io