Does "ceph osd df tree" show stats properly (I mean there are no evident
gaps like unexpected zero values) for all the daemons?
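
If it helps, something along these lines (just a sketch; the field names are
taken from the Nautilus JSON output, adjust as needed) would list any OSDs
that report zero used bytes:

ceph osd df tree -f json | jq -r '.nodes[] | select(.type == "osd" and .kb_used == 0) | .name'
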
> 1. Anyway, I found something weird...
>
> I created a new 1-PG pool "foo" on a different cluster and wrote some
> data to it.
>
> The stored and used are equal.
>
> Thu 26 Nov 19:26:58 CET 2020
> RAW STORAGE:
> CLASS     SIZE       AVAIL      USED       RAW USED   %RAW USED
> hdd       5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.31
> TOTAL     5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.31
>
> POOLS:
> POOL      ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
> public    68    2.9 PiB    143.54M    2.9 PiB    78.49    538 TiB
> test      71    29 MiB     6.56k      29 MiB     0        269 TiB
> foo       72    1.2 GiB    308        1.2 GiB    0        269 TiB
>
> But when I tried restarting the three relevant OSDs, the bytes_used was
> temporarily reported correctly:
>
> Thu 26 Nov 19:27:00 CET 2020
> RAW STORAGE:
> CLASS     SIZE       AVAIL      USED       RAW USED   %RAW USED
> hdd       5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.62
> TOTAL     5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.62
>
> POOLS:
> POOL      ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
> public    68    2.9 PiB    143.54M    4.3 PiB    84.55    538 TiB
> test      71    29 MiB     6.56k      1.2 GiB    0        269 TiB
> foo       72    1.2 GiB    308        3.6 GiB    0        269 TiB
>
> But then a few seconds later it's back to used == stored:
>
> Thu 26 Nov 19:27:03 CET 2020
> RAW STORAGE:
> CLASS     SIZE       AVAIL      USED       RAW USED   %RAW USED
> hdd       5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.47
> TOTAL     5.5 PiB    1.2 PiB    4.3 PiB    4.3 PiB    78.47
>
> POOLS:
> POOL      ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
> public    68    2.9 PiB    143.54M    2.9 PiB    78.49    538 TiB
> test      71    29 MiB     6.56k      29 MiB     0        269 TiB
> foo       72    1.2 GiB    308        1.2 GiB    0        269 TiB
>
> It seems to report the correct stats only when the PG is peering (or in
> some other transient state).
> I've restarted all three relevant OSDs now -- the stats are reported
> as stored == used.
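>
> (For the record, the snapshots above just came from a trivial watch loop
> along the lines of
>
> while sleep 1; do date; ceph df; done
>
> run while restarting the OSDs from another terminal.)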
>
> 2. Another data point -- I found another old cluster that reports
> stored/used correctly. I have no idea what might be different about
> that cluster -- we updated it just like the others.
>
> Cheers, Dan
>
> On Thu, Nov 26, 2020 at 6:22 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>> For a specific BlueStore instance you can learn the relevant statfs output by
>> setting debug_bluestore to 20 and leaving the OSD for 5-10 seconds (or maybe
>> a couple of minutes - I don't remember the exact statfs poll period).
>>
>> Then grep the osd log for "statfs" and/or "pool_statfs" and get the output
>> formatted as per the following operator (taken from src/osd/osd_types.cc):
>>
>> ostream& operator<<(ostream& out, const store_statfs_t &s)
>> {
>>   out << std::hex
>>       << "store_statfs(0x" << s.available
>>       << "/0x" << s.internally_reserved
>>       << "/0x" << s.total
>>       << ", data 0x" << s.data_stored
>>       << "/0x" << s.allocated
>>       << ", compress 0x" << s.data_compressed
>>       << "/0x" << s.data_compressed_allocated
>>       << "/0x" << s.data_compressed_original
>>       << ", omap 0x" << s.omap_allocated
>>       << ", meta 0x" << s.internal_metadata
>>       << std::dec
>>       << ")";
>>   return out;
>> }
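>>
>> In practice that would look roughly like this (just a sketch -- osd.123, the
>> log path and the hex value at the end are only placeholders/examples):
>>
>> ceph tell osd.123 config set debug_bluestore 20
>> sleep 120                          # wait for at least one statfs poll
>> ceph tell osd.123 config set debug_bluestore 1/5
>> grep -E 'statfs|pool_statfs' /var/log/ceph/ceph-osd.123.log | tail -5
>> printf '%d\n' 0x11e1a300           # the fields are hex; decode a value like this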
>>
>> But honestly I doubt it's BlueStore that is reporting incorrectly, since
>> BlueStore doesn't care about replication.
>>
>> It looks more like missing stats from some replicas, or improper handling
>> of the pg replication factor...
>>
>> Perhaps legacy vs. new pool is what matters... Can you try creating a new
>> pool on the old cluster, filling it with some data (e.g. just a single 64K
>> object), and checking the stats?
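>>
>> Something along these lines should do (just a sketch; the pool and object
>> names are arbitrary):
>>
>> ceph osd pool create foo 1 1
>> dd if=/dev/urandom of=/tmp/obj64k bs=64K count=1
>> rados -p foo put obj64k /tmp/obj64k
>> ceph df detail | grep foo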
>>
>>
>> Thanks,
>>
>> Igor
>>
>> On 11/26/2020 8:00 PM, Dan van der Ster wrote:
>>> Hi Igor,
>>>
>>> No BLUESTORE_LEGACY_STATFS warning, and
>>> bluestore_warn_on_legacy_statfs is the default true on this (and all)
>>> clusters.
>>> I'm quite sure we did the statfs conversion during one of the recent
>>> upgrades (I forget which one exactly).
>>>
>>> # ceph tell osd.* config get bluestore_warn_on_legacy_statfs | grep -v true
>>> #
>>>
>>> Is there a command to see the statfs reported by an individual OSD?
>>> We have a mix of ~year old and recently recreated OSDs, so I could try
>>> to see if they differ.
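>>>
>>> (The closest thing I've found is the bluestore perf counters on the admin
>>> socket, e.g. the sketch below, run on the OSD's host -- though I'm not sure
>>> that's the same statfs the mons aggregate.)
>>>
>>> ceph daemon osd.0 perf dump | jq '.bluestore | {bluestore_allocated, bluestore_stored}'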
>>>
>>> Thanks!
>>>
>>> Dan
>>>
>>>
>>> On Thu, Nov 26, 2020 at 5:50 PM Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>> Hi Dan
>>>>
>>>> don't you have a BLUESTORE_LEGACY_STATFS alert raised (it might be
>>>> silenced by the bluestore_warn_on_legacy_statfs param) for the older cluster?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Igor
>>>>
>>>>
>>>> On 11/26/2020 7:29 PM, Dan van der Ster wrote:
>>>>> Hi,
>>>>>
>>>>> Depending on which cluster I look at (all running v14.2.11), the pool
>>>>> bytes_used reports either raw space or stored bytes.
>>>>>
>>>>> Here's a 7-year-old cluster:
>>>>>
>>>>> # ceph df -f json | jq .pools[0]
>>>>> {
>>>>>   "name": "volumes",
>>>>>   "id": 4,
>>>>>   "stats": {
>>>>>     "stored": 1229308190855881,
>>>>>     "objects": 294401604,
>>>>>     "kb_used": 1200496280133,
>>>>>     "bytes_used": 1229308190855881,
>>>>>     "percent_used": 0.4401889145374298,
>>>>>     "max_avail": 521125025021952
>>>>>   }
>>>>> }
>>>>>
>>>>> Note that stored == bytes_used for that pool (it is a 3x replica pool).
>>>>>
>>>>> But here's a newer cluster (installed recently with nautilus):
>>>>>
>>>>> # ceph df -f json | jq .pools[0]
>>>>> {
>>>>>   "name": "volumes",
>>>>>   "id": 1,
>>>>>   "stats": {
>>>>>     "stored": 680977600893041,
>>>>>     "objects": 163155803,
>>>>>     "kb_used": 1995736271829,
>>>>>     "bytes_used": 2043633942351985,
>>>>>     "percent_used": 0.23379847407341003,
>>>>>     "max_avail": 2232457428467712
>>>>>   }
>>>>> }
>>>>>
>>>>> In the second cluster, bytes_used is 3x stored.
>>>>>
>>>>> Does anyone know why these are not reported consistently?
>>>>> Noticing this just now, I'll update our monitoring to plot stored
>>>>> rather than bytes_used from now on.
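>>>>>
>>>>> (A quick way to spot affected clusters -- just a sketch, skipping empty
>>>>> pools to avoid dividing by zero:
>>>>>
>>>>> ceph df -f json | jq '.pools[] | select(.stats.stored > 0) | {name, ratio: (.stats.bytes_used / .stats.stored)}'
>>>>>
>>>>> A ratio near 1 means used == stored; near the replica count means raw bytes.)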
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Dan
>>>>> _______________________________________________
>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io