Also, I'm wondering whether the different cluster(s) that show stats
improperly have the same "gap" OSDs?
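To spot such gaps quickly, something along these lines could be run against `ceph osd df -f json` output (a rough sketch; the sample JSON is trimmed and hypothetical, so check the exact field names against your Nautilus output):

```python
import json

# Trimmed, made-up sample of `ceph osd df -f json` output; the real output
# carries many more fields per OSD. osd.100 mimics an up-but-out daemon
# that reports zero stats.
sample = """
{"nodes": [
  {"id": 0,   "name": "osd.0",   "kb": 9766436864, "kb_used": 7813149491, "pgs": 120},
  {"id": 100, "name": "osd.100", "kb": 0,          "kb_used": 0,          "pgs": 0}
]}
"""

def find_gap_osds(df_json):
    """Return names of OSDs whose capacity or PG count is unexpectedly zero."""
    return [n["name"] for n in json.loads(df_json)["nodes"]
            if n["kb"] == 0 or n["pgs"] == 0]

print(find_gap_osds(sample))  # ['osd.100']
```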
On 11/26/2020 10:08 PM, Dan van der Ster wrote:
> Hey that's it!
>
> I stopped the up but out OSDs (100 and 177), and now the stats are correct!
>
> # ceph df
> RAW STORAGE:
> CLASS SIZE AVAIL USED RAW USED %RAW USED
> hdd 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
> TOTAL 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
>
> POOLS:
> POOL ID STORED OBJECTS USED %USED MAX AVAIL
> public 68 2.9 PiB 143.56M 4.3 PiB 84.55 538 TiB
> test 71 29 MiB 6.56k 1.2 GiB 0 269 TiB
> foo 72 1.2 GiB 308 3.6 GiB 0 269 TiB
>
>
>
> On Thu, Nov 26, 2020 at 8:02 PM Dan van der Ster <dan(a)vanderster.com> wrote:
>> There are a couple gaps, yes: https://termbin.com/9mx1
>>
>> What should I do?
>>
>> -- dan
>>
>> On Thu, Nov 26, 2020 at 7:52 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>> Does "ceph osd df tree" show stats properly (I mean there are no evident
>>> gaps like unexpected zero values) for all the daemons?
>>>
>>>
>>>> 1. Anyway, I found something weird...
>>>>
>>>> I created a new 1-PG pool "foo" on a different cluster and wrote some
>>>> data to it.
>>>>
>>>> The stored and used are equal.
>>>>
>>>> Thu 26 Nov 19:26:58 CET 2020
>>>> RAW STORAGE:
>>>> CLASS SIZE AVAIL USED RAW USED %RAW USED
>>>> hdd 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.31
>>>> TOTAL 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.31
>>>>
>>>> POOLS:
>>>> POOL ID STORED OBJECTS USED %USED MAX AVAIL
>>>> public 68 2.9 PiB 143.54M 2.9 PiB 78.49 538 TiB
>>>> test 71 29 MiB 6.56k 29 MiB 0 269 TiB
>>>> foo 72 1.2 GiB 308 1.2 GiB 0 269 TiB
>>>>
>>>> But I tried restarting the relevant three OSDs, and the bytes_used are
>>>> temporarily reported correctly:
>>>>
>>>> Thu 26 Nov 19:27:00 CET 2020
>>>> RAW STORAGE:
>>>> CLASS SIZE AVAIL USED RAW USED %RAW USED
>>>> hdd 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
>>>> TOTAL 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.62
>>>>
>>>> POOLS:
>>>> POOL ID STORED OBJECTS USED %USED MAX AVAIL
>>>> public 68 2.9 PiB 143.54M 4.3 PiB 84.55 538 TiB
>>>> test 71 29 MiB 6.56k 1.2 GiB 0 269 TiB
>>>> foo 72 1.2 GiB 308 3.6 GiB 0 269 TiB
>>>>
>>>> But then a few seconds later it's back to used == stored:
>>>>
>>>> Thu 26 Nov 19:27:03 CET 2020
>>>> RAW STORAGE:
>>>> CLASS SIZE AVAIL USED RAW USED %RAW USED
>>>> hdd 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.47
>>>> TOTAL 5.5 PiB 1.2 PiB 4.3 PiB 4.3 PiB 78.47
>>>>
>>>> POOLS:
>>>> POOL ID STORED OBJECTS USED %USED MAX AVAIL
>>>> public 68 2.9 PiB 143.54M 2.9 PiB 78.49 538 TiB
>>>> test 71 29 MiB 6.56k 29 MiB 0 269 TiB
>>>> foo 72 1.2 GiB 308 1.2 GiB 0 269 TiB
>>>>
>>>> It seems to report the correct stats only when the PG is peering (or
>>>> in some other transient state).
>>>> I've restarted all three relevant OSDs now -- the stats are reported
>>>> as stored == used.
>>>>
>>>> 2. Another data point -- I found another old cluster that reports
>>>> stored/used correctly. I have no idea what might be different about
>>>> that cluster -- we updated it just like the others.
>>>>
>>>> Cheers, Dan
>>>>
>>>> On Thu, Nov 26, 2020 at 6:22 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>> For a specific BlueStore instance you can learn the relevant statfs
>>>>> output by setting debug_bluestore to 20 and leaving the OSD running for
>>>>> 5-10 seconds (or maybe a couple of minutes - I don't remember the exact
>>>>> statfs poll period).
>>>>>
>>>>> Then grep the OSD log for "statfs" and/or "pool_statfs"; the output is
>>>>> formatted as per the following operator (taken from src/osd/osd_types.cc):
>>>>>
>>>>> ostream& operator<<(ostream& out, const store_statfs_t &s)
>>>>> {
>>>>>   out << std::hex
>>>>>       << "store_statfs(0x" << s.available
>>>>>       << "/0x" << s.internally_reserved
>>>>>       << "/0x" << s.total
>>>>>       << ", data 0x" << s.data_stored
>>>>>       << "/0x" << s.allocated
>>>>>       << ", compress 0x" << s.data_compressed
>>>>>       << "/0x" << s.data_compressed_allocated
>>>>>       << "/0x" << s.data_compressed_original
>>>>>       << ", omap 0x" << s.omap_allocated
>>>>>       << ", meta 0x" << s.internal_metadata
>>>>>       << std::dec
>>>>>       << ")";
>>>>>   return out;
>>>>> }
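To turn one of those hex log lines back into numbers, a small parser along these lines might help (a hypothetical helper; it assumes the line matches exactly the format printed by the operator above):

```python
import re

# Field order matches the operator<< above.
FIELDS = ["available", "internally_reserved", "total", "data_stored",
          "allocated", "data_compressed", "data_compressed_allocated",
          "data_compressed_original", "omap_allocated", "internal_metadata"]

def parse_statfs(line):
    """Map the hex values in a store_statfs(...) log line to named byte counts."""
    values = (int(v, 16) for v in re.findall(r"0x([0-9a-f]+)", line))
    return dict(zip(FIELDS, values))

# Made-up example line in the operator's format.
sample = ("store_statfs(0x1d1c1000000/0x0/0x574e0000000, data 0x12345678/0x12400000, "
          "compress 0x0/0x0/0x0, omap 0x100000, meta 0x4000000)")
print(parse_statfs(sample)["data_stored"])  # 305419896
```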
>>>>>
>>>>> But honestly I doubt it is BlueStore that reports incorrectly, since
>>>>> it doesn't care about replication.
>>>>>
>>>>> It rather looks like lack of stats from some replicas or improper pg
>>>>> replica factor processing...
>>>>>
>>>>> Perhaps legacy vs. new pool is what matters... Can you try to create a
>>>>> new pool on the old cluster, fill it with some data (e.g. just a single
>>>>> 64K object), and check the stats?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Igor
>>>>>
>>>>> On 11/26/2020 8:00 PM, Dan van der Ster wrote:
>>>>>> Hi Igor,
>>>>>>
>>>>>> No BLUESTORE_LEGACY_STATFS warning, and
>>>>>> bluestore_warn_on_legacy_statfs is the default true on this (and all)
>>>>>> clusters.
>>>>>> I'm quite sure we did the statfs conversion during one of the recent
>>>>>> upgrades (I forget which one exactly).
>>>>>>
>>>>>> # ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
>>>>>> #
>>>>>>
>>>>>> Is there a command to see the statfs reported by an individual OSD?
>>>>>> We have a mix of ~year-old and recently recreated OSDs, so I could try
>>>>>> to see if they differ.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>>>> Hi Dan
>>>>>>>
>>>>>>> don't you have the BLUESTORE_LEGACY_STATFS alert raised (it might be
>>>>>>> silenced by the bluestore_warn_on_legacy_statfs param) for the older
>>>>>>> cluster?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Igor
>>>>>>>
>>>>>>>
>>>>>>> On 11/26/2020 7:29 PM, Dan van der Ster wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Depending on which cluster I look at (all running v14.2.11),
>>>>>>>> bytes_used variably reports either raw space or stored bytes.
>>>>>>>>
>>>>>>>> Here's a 7-year-old cluster:
>>>>>>>>
>>>>>>>> # ceph df -f json | jq .pools[0]
>>>>>>>> {
>>>>>>>> "name": "volumes",
>>>>>>>> "id": 4,
>>>>>>>> "stats": {
>>>>>>>> "stored": 1229308190855881,
>>>>>>>> "objects": 294401604,
>>>>>>>> "kb_used": 1200496280133,
>>>>>>>> "bytes_used": 1229308190855881,
>>>>>>>> "percent_used": 0.4401889145374298,
>>>>>>>> "max_avail": 521125025021952
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> Note that stored == bytes_used for that pool (this is a 3x replica
>>>>>>>> pool).
>>>>>>>>
>>>>>>>> But here's a newer cluster (installed recently with Nautilus):
>>>>>>>>
>>>>>>>> # ceph df -f json | jq .pools[0]
>>>>>>>> {
>>>>>>>> "name": "volumes",
>>>>>>>> "id": 1,
>>>>>>>> "stats": {
>>>>>>>> "stored": 680977600893041,
>>>>>>>> "objects": 163155803,
>>>>>>>> "kb_used": 1995736271829,
>>>>>>>> "bytes_used": 2043633942351985,
>>>>>>>> "percent_used": 0.23379847407341003,
>>>>>>>> "max_avail": 2232457428467712
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> In the second cluster, bytes_used is 3x stored.
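A quick sanity check is to compute bytes_used / stored per pool from the `ceph df -f json` output (a sketch using the volumes pool numbers from the two clusters above):

```python
# Pool stats copied from the two `ceph df -f json` outputs above.
old_cluster = {"stored": 1229308190855881, "bytes_used": 1229308190855881}
new_cluster = {"stored": 680977600893041, "bytes_used": 2043633942351985}

def used_to_stored_ratio(stats):
    """bytes_used / stored: ~1.0 means 'stored' bytes are reported,
    ~3.0 means raw bytes on a 3x replicated pool."""
    return stats["bytes_used"] / stats["stored"]

print(round(used_to_stored_ratio(old_cluster), 2))  # 1.0
print(round(used_to_stored_ratio(new_cluster), 2))  # 3.0
```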
>>>>>>>>
>>>>>>>> Does anyone know why these are not reported consistently?
>>>>>>>> Having noticed this just now, I'll update our monitoring to plot
>>>>>>>> stored rather than bytes_used from now on.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Dan
>>>>>>>> _______________________________________________
>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io