OK, cool!
Will try to reproduce this locally tomorrow...
Thanks,
Igor
On 11/26/2020 10:19 PM, Dan van der Ster wrote:
> Those osds are intentionally out, yes. (They were drained to be replaced).
>
> I have fixed 2 clusters' stats already with this method ... both had
> up but out osds, and stopping the up/out osd fixed the stats.
>
> I opened a tracker for this: https://tracker.ceph.com/issues/48385
>
> -- dan
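For anyone hitting the same symptom, the up-but-out daemons Dan stopped can be spotted from `ceph osd dump -f json`. A minimal sketch; the sample dump below is hypothetical and trimmed to the relevant fields (a real dump carries many more keys per OSD):

```python
import json

# Hypothetical, trimmed sample of `ceph osd dump -f json` output.
sample = json.loads("""
{
  "osds": [
    {"osd": 99,  "up": 1, "in": 1},
    {"osd": 100, "up": 1, "in": 0},
    {"osd": 177, "up": 1, "in": 0}
  ]
}
""")

# OSDs that are up but out: per the tracker issue above, stopping these
# daemons corrected the `ceph df` pool stats.
up_but_out = [o["osd"] for o in sample["osds"]
              if o["up"] == 1 and o["in"] == 0]
print(up_but_out)  # [100, 177]
```

In practice you would pipe the live `ceph osd dump -f json` into the same filter instead of the embedded sample.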
>
> On Thu, Nov 26, 2020 at 8:14 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>> Also wondering whether you have the same "gap" OSDs on the other
>> cluster(s) which show stats improperly?
>>
>>
>> On 11/26/2020 10:08 PM, Dan van der Ster wrote:
>>> Hey that's it!
>>>
>>> I stopped the up but out OSDs (100 and 177), and now the stats are correct!
>>>
>>> # ceph df
>>> RAW STORAGE:
>>> CLASS SIZE AVAIL USED RAW USED %RAW USED
>>> hdd 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
>>> TOTAL 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
>>>
>>> POOLS:
>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>>>     public     68     2.9 PiB     143.56M     4.3 PiB     84.55       538 TiB
>>>     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
>>>     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
>>>
>>>
>>>
>>> On Thu, Nov 26, 2020 at 8:02 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>>>> There are a couple gaps, yes: https://termbin.com/9mx1
>>>>
>>>> What should I do?
>>>>
>>>> -- dan
>>>>
>>>> On Thu, Nov 26, 2020 at 7:52 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>> Does "ceph osd df tree" show stats properly (I mean there are no evident
>>>>> gaps like unexpected zero values) for all the daemons?
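The gap check above can be scripted against `ceph osd df tree -f json`. A minimal sketch; the sample output below is hypothetical and trimmed to the fields used:

```python
import json

# Hypothetical, trimmed sample of `ceph osd df tree -f json` output.
sample = json.loads("""
{
  "nodes": [
    {"id": 100, "name": "osd.100", "type": "osd", "kb_used": 0},
    {"id": 101, "name": "osd.101", "type": "osd", "kb_used": 7516192768}
  ]
}
""")

# Flag the "gaps": daemons whose usage is unexpectedly reported as zero.
gaps = [n["name"] for n in sample["nodes"]
        if n["type"] == "osd" and n["kb_used"] == 0]
print(gaps)  # ['osd.100']
```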
>>>>>
>>>>>
>>>>>> 1. Anyway, I found something weird...
>>>>>>
>>>>>> I created a new 1-PG pool "foo" on a different cluster and wrote some
>>>>>> data to it.
>>>>>>
>>>>>> The stored and used are equal.
>>>>>>
>>>>>> Thu 26 Nov 19:26:58 CET 2020
>>>>>> RAW STORAGE:
>>>>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>>>>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.31
>>>>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.31
>>>>>>
>>>>>> POOLS:
>>>>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>>>>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
>>>>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
>>>>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
>>>>>>
>>>>>> But I tried restarting the relevant three OSDs, and the bytes_used are
>>>>>> temporarily reported correctly:
>>>>>>
>>>>>> Thu 26 Nov 19:27:00 CET 2020
>>>>>> RAW STORAGE:
>>>>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>>>>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.62
>>>>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.62
>>>>>>
>>>>>> POOLS:
>>>>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>>>>>>     public     68     2.9 PiB     143.54M     4.3 PiB     84.55       538 TiB
>>>>>>     test       71      29 MiB       6.56k     1.2 GiB         0       269 TiB
>>>>>>     foo        72     1.2 GiB         308     3.6 GiB         0       269 TiB
>>>>>>
>>>>>> But then a few seconds later it's back to used == stored:
>>>>>>
>>>>>> Thu 26 Nov 19:27:03 CET 2020
>>>>>> RAW STORAGE:
>>>>>>     CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
>>>>>>     hdd       5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.47
>>>>>>     TOTAL     5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.47
>>>>>>
>>>>>> POOLS:
>>>>>>     POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
>>>>>>     public     68     2.9 PiB     143.54M     2.9 PiB     78.49       538 TiB
>>>>>>     test       71      29 MiB       6.56k      29 MiB         0       269 TiB
>>>>>>     foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB
>>>>>>
>>>>>> It seems to report the correct stats only when the PG is peering (or in
>>>>>> some other transitional state).
>>>>>> I've restarted all three relevant OSDs now -- the stats are reported
>>>>>> as stored == used.
>>>>>>
>>>>>> 2. Another data point -- I found another old cluster that reports
>>>>>> stored/used correctly. I have no idea what might be different about
>>>>>> that cluster -- we updated it just like the others.
>>>>>>
>>>>>> Cheers, Dan
>>>>>>
>>>>>> On Thu, Nov 26, 2020 at 6:22 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>>>> For a specific BlueStore instance you can learn the relevant statfs
>>>>>>> output by setting debug_bluestore to 20 and leaving the OSD running for
>>>>>>> 5-10 seconds (or maybe a couple of minutes - I don't remember the exact
>>>>>>> statfs poll period).
>>>>>>>
>>>>>>> Then grep the osd log for "statfs" and/or "pool_statfs" and read the
>>>>>>> output formatted as per the following operator (taken from
>>>>>>> src/osd/osd_types.cc):
>>>>>>>
>>>>>>> ostream& operator<<(ostream& out, const store_statfs_t &s)
>>>>>>> {
>>>>>>>   out << std::hex
>>>>>>>       << "store_statfs(0x" << s.available
>>>>>>>       << "/0x" << s.internally_reserved
>>>>>>>       << "/0x" << s.total
>>>>>>>       << ", data 0x" << s.data_stored
>>>>>>>       << "/0x" << s.allocated
>>>>>>>       << ", compress 0x" << s.data_compressed
>>>>>>>       << "/0x" << s.data_compressed_allocated
>>>>>>>       << "/0x" << s.data_compressed_original
>>>>>>>       << ", omap 0x" << s.omap_allocated
>>>>>>>       << ", meta 0x" << s.internal_metadata
>>>>>>>       << std::dec
>>>>>>>       << ")";
>>>>>>>   return out;
>>>>>>> }
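Those hex fields can be pulled out of a grepped log line with a short script. A sketch assuming exactly the format printed by the operator above; the sample line and its hex values are made up for illustration:

```python
import re

# A statfs line constructed by hand to match the operator<< format above.
line = ("store_statfs(0x1000000/0x0/0x2000000, data 0x400000/0x500000, "
        "compress 0x0/0x0/0x0, omap 0x1000, meta 0x20000)")

# One named group per field of store_statfs_t, in the order printed.
pattern = re.compile(
    r"store_statfs\(0x(?P<available>[0-9a-f]+)"
    r"/0x(?P<internally_reserved>[0-9a-f]+)"
    r"/0x(?P<total>[0-9a-f]+)"
    r", data 0x(?P<data_stored>[0-9a-f]+)"
    r"/0x(?P<allocated>[0-9a-f]+)"
    r", compress 0x(?P<data_compressed>[0-9a-f]+)"
    r"/0x(?P<data_compressed_allocated>[0-9a-f]+)"
    r"/0x(?P<data_compressed_original>[0-9a-f]+)"
    r", omap 0x(?P<omap_allocated>[0-9a-f]+)"
    r", meta 0x(?P<internal_metadata>[0-9a-f]+)\)")

m = pattern.search(line)
stats = {k: int(v, 16) for k, v in m.groupdict().items()}
print(stats["data_stored"], stats["total"])  # 4194304 33554432
```

Running the same `pattern.search` over each grepped log line gives per-poll statfs values in plain integers, which makes it easy to diff old vs. recreated OSDs.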
>>>>>>>
>>>>>>> But honestly I doubt it is BlueStore which reports incorrectly, since
>>>>>>> it doesn't care about replication.
>>>>>>>
>>>>>>> It rather looks like a lack of stats from some replicas, or improper pg
>>>>>>> replica factor processing...
>>>>>>>
>>>>>>> Perhaps legacy vs. new pool is what matters... Can you try to create a
>>>>>>> new pool on the old cluster, fill it with some data (e.g. just a single
>>>>>>> 64K object), and check the stats?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Igor
>>>>>>>
>>>>>>> On 11/26/2020 8:00 PM, Dan van der Ster wrote:
>>>>>>>> Hi Igor,
>>>>>>>>
>>>>>>>> No BLUESTORE_LEGACY_STATFS warning, and
>>>>>>>> bluestore_warn_on_legacy_statfs is the default true on this (and all)
>>>>>>>> clusters.
>>>>>>>> I'm quite sure we did the statfs conversion during one of the recent
>>>>>>>> upgrades (I forget which one exactly).
>>>>>>>>
>>>>>>>> # ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
>>>>>>>> #
>>>>>>>>
>>>>>>>> Is there a command to see the statfs reported by an individual OSD?
>>>>>>>> We have a mix of ~year old and recently recreated OSDs, so I could try
>>>>>>>> to see if they differ.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>>>>>> Hi Dan
>>>>>>>>>
>>>>>>>>> Don't you have a BLUESTORE_LEGACY_STATFS alert raised (it might be
>>>>>>>>> silenced by the bluestore_warn_on_legacy_statfs param) for the older
>>>>>>>>> cluster?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Igor
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 11/26/2020 7:29 PM, Dan van der Ster wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Depending on which cluster I look at (all running v14.2.11), the
>>>>>>>>>> bytes_used is reporting raw space or stored bytes variably.
>>>>>>>>>>
>>>>>>>>>> Here's a 7 year old cluster:
>>>>>>>>>>
>>>>>>>>>> # ceph df -f json | jq .pools[0]
>>>>>>>>>> {
>>>>>>>>>>   "name": "volumes",
>>>>>>>>>>   "id": 4,
>>>>>>>>>>   "stats": {
>>>>>>>>>>     "stored": 1229308190855881,
>>>>>>>>>>     "objects": 294401604,
>>>>>>>>>>     "kb_used": 1200496280133,
>>>>>>>>>>     "bytes_used": 1229308190855881,
>>>>>>>>>>     "percent_used": 0.4401889145374298,
>>>>>>>>>>     "max_avail": 521125025021952
>>>>>>>>>>   }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Note that stored == bytes_used for that pool (this is a 3x replica pool).
>>>>>>>>>>
>>>>>>>>>> But here's a newer cluster (installed recently with nautilus):
>>>>>>>>>>
>>>>>>>>>> # ceph df -f json | jq .pools[0]
>>>>>>>>>> {
>>>>>>>>>>   "name": "volumes",
>>>>>>>>>>   "id": 1,
>>>>>>>>>>   "stats": {
>>>>>>>>>>     "stored": 680977600893041,
>>>>>>>>>>     "objects": 163155803,
>>>>>>>>>>     "kb_used": 1995736271829,
>>>>>>>>>>     "bytes_used": 2043633942351985,
>>>>>>>>>>     "percent_used": 0.23379847407341003,
>>>>>>>>>>     "max_avail": 2232457428467712
>>>>>>>>>>   }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> In the second cluster, bytes_used is 3x stored.
>>>>>>>>>>
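As a sanity check on the two jq outputs above, the bytes_used/stored ratio makes the discrepancy obvious: on a healthy 3x-replica pool it should be close to 3.0, while ~1.0 indicates the underreporting. A minimal sketch using the figures quoted in this thread:

```python
# Pool stats taken from the two `ceph df -f json` outputs above.
old_cluster = {"stored": 1229308190855881, "bytes_used": 1229308190855881}
new_cluster = {"stored": 680977600893041, "bytes_used": 2043633942351985}

# On a 3x-replica pool, bytes_used should be roughly 3x stored; a ratio
# of ~1.0 means raw usage is being underreported.
for name, pool in (("old", old_cluster), ("new", new_cluster)):
    ratio = pool["bytes_used"] / pool["stored"]
    print(f"{name}: bytes_used/stored = {ratio:.2f}")
# old: bytes_used/stored = 1.00
# new: bytes_used/stored = 3.00
```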
>>>>>>>>>> Does anyone know why these are not reported consistently?
>>>>>>>>>> Noticing this just now, I'll update our monitoring to plot stored
>>>>>>>>>> rather than bytes_used from now on.
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io