The same experiment, with the mds daemons pulling 4GB instead of 16GB and
with the starting total memory fixed (I accidentally used memory_available_kb
instead of memory_total_kb the first time), gives us:
DEBUG cephadm.autotune:autotune.py:35 Autotuning OSD memory with given parameters:
Total memory: 23530995712
Daemons: [<DaemonDescription>(crash.a), <DaemonDescription>(grafana.a), <DaemonDescription>(mds.a), <DaemonDescription>(mds.b), <DaemonDescription>(mds.c), <DaemonDescription>(mgr.a), <DaemonDescription>(mon.a), <DaemonDescription>(node-exporter.a), <DaemonDescription>(osd.1), <DaemonDescription>(osd.2), <DaemonDescription>(osd.3), <DaemonDescription>(osd.4), <DaemonDescription>(prometheus.a)]
DEBUG cephadm.autotune:autotune.py:50 Subtracting 134217728 from total for crash daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 23396777984
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for grafana daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 22323036160
DEBUG cephadm.autotune:autotune.py:40 Subtracting 4294967296 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: 18028068864
DEBUG cephadm.autotune:autotune.py:40 Subtracting 4294967296 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: 13733101568
DEBUG cephadm.autotune:autotune.py:40 Subtracting 4294967296 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: 9438134272
DEBUG cephadm.autotune:autotune.py:50 Subtracting 4294967296 from total for mgr daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 5143166976
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for mon daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 4069425152
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for node-exporter daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 2995683328
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for prometheus daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 1921941504
DEBUG cephadm.autotune:autotune.py:66 Final total is 1921941504 to be split among 4 OSDs
DEBUG cephadm.autotune:autotune.py:68 Result is 480485376 per OSD
My understanding is, given a starting memory_total_kb of 32827840, we get
33615708160 total bytes. We multiply that by the 0.7 autotune ratio to get
23530995712 bytes to be split among the daemons (something like 23-24 GB).
Then the mgr and mds daemons each get 4GB, the grafana, mon, node-exporter,
and prometheus daemons each take 1GB, and the crash daemon gets 128MB. That
leaves us with just under 2GB to split among the 4 OSDs. That's how we arrive
at that "480485376" number per OSD from the original error message you posted:
Unable to set osd_memory_target on my-ceph01 to 480485376: error parsing
value: Value '480485376' is below minimum 939524096
As that value is well below the minimum (it's only about half a GB), it
reports that error when trying to set it.
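
If it helps to see it end to end, here's a minimal sketch of that arithmetic
in Python (per-daemon sizes taken from the debug log above; this is just an
illustration, not the actual autotune.py code):

GiB = 1024 ** 3

# Starting point: memory_total_kb in bytes, times the 0.7 autotune ratio.
memory_total_kb = 32827840
total = int(memory_total_kb * 1024 * 0.7)  # 23530995712

# Per-daemon subtractions, matching the debug log above.
daemon_sizes = [
    128 * 1024 ** 2,            # crash: 134217728
    1 * GiB,                    # grafana
    4 * GiB, 4 * GiB, 4 * GiB,  # three mds daemons
    4 * GiB,                    # mgr
    1 * GiB,                    # mon
    1 * GiB,                    # node-exporter
    1 * GiB,                    # prometheus
]
total -= sum(daemon_sizes)  # 1921941504

print(total // 4)  # 480485376 per OSD, well below the 939524096 minimum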
On Tue, Apr 9, 2024 at 12:58 PM Mads Aasted <mads2a(a)gmail.com> wrote:
Hi Adam
Seems like the mds_cache_memory_limit, both set globally through cephadm and
on the hosts' mds daemons, is approx. 4GB in all cases:
root@my-ceph01:/# ceph config get mds mds_cache_memory_limit
4294967296
Same if I query the individual mds daemons running on my-ceph01, or any of
the other mds daemons on the other hosts.
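For example, querying one of them directly should give something like this
(daemon name taken from the ceph orch ps output further down in this thread):

root@my-ceph01:/# ceph config show mds.prodfs.my-ceph01.rplvac mds_cache_memory_limit
4294967296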
On Tue, Apr 9, 2024 at 6:14 PM Mads Aasted <mads2a(a)gmail.com> wrote:
> Hi Adam
>
> Let me just finish tucking in a devilish tyke here and I'll get to it
> first thing.
>
> On Tue, Apr 9, 2024 at 6:09 PM Adam King <adking(a)redhat.com> wrote:
>
>> I did end up writing a unit test to see what we calculated here, as well
>> as adding a bunch of debug logging (haven't created a PR yet, but probably
>> will). The total memory was set to (19858056 * 1024 * 0.7) (total memory
>> in bytes * the autotune target ratio) = 14234254540. What ended up getting
>> logged was (ignore the daemon id for the daemons, they don't affect
>> anything. Only the types matter)
>>
>> DEBUG cephadm.autotune:autotune.py:35 Autotuning OSD memory with given parameters:
>> Total memory: 14234254540
>> Daemons: [<DaemonDescription>(crash.a), <DaemonDescription>(grafana.a), <DaemonDescription>(mds.a), <DaemonDescription>(mds.b), <DaemonDescription>(mds.c), <DaemonDescription>(mgr.a), <DaemonDescription>(mon.a), <DaemonDescription>(node-exporter.a), <DaemonDescription>(osd.1), <DaemonDescription>(osd.2), <DaemonDescription>(osd.3), <DaemonDescription>(osd.4), <DaemonDescription>(prometheus.a)]
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 134217728 from total for crash daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: 14100036812
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for grafana daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: 13026294988
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -4153574196
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -21333443380
>> DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
>> DEBUG cephadm.autotune:autotune.py:42 new total: -38513312564
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 4294967296 from total for mgr daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -42808279860
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for mon daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -43882021684
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for node-exporter daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -44955763508
>> DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for prometheus daemon
>> DEBUG cephadm.autotune:autotune.py:52 new total: -46029505332
>>
>> It looks like it was taking pretty much all the memory away for the mds
>> daemons. The amount, however, is taken from the "mds_cache_memory_limit"
>> setting for each mds daemon. The number it was defaulting to for the test
>> is quite large. I guess I'd need to know what that comes out to for the mds
>> daemons in your cluster to get a full picture. Also, you can see the total
>> go well into the negatives here. When that happens cephadm just tries to
>> remove the osd_memory_target config settings for the OSDs on the host, but
>> given the error message from your initial post, it must be getting some
>> positive value when actually running on your system.
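>>
>> (To make the negative totals concrete: after the crash and grafana
>> subtractions the total is 13026294988 bytes, and the three mds daemons at
>> the 16 GiB test default take 3 * 17179869184 = 51539607552 bytes, which is
>> how it lands at -38513312564 before the mgr is even subtracted.)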
>>
>> On Fri, Apr 5, 2024 at 2:21 AM Mads Aasted <mads2a(a)gmail.com> wrote:
>>
>>> Hi Adam
>>> No problem, I really appreciate your input :)
>>> The memory stats returned are as follows
>>> "memory_available_kb": 19858056,
>>> "memory_free_kb": 277480,
>>> "memory_total_kb": 32827840,
>>>
>>> On Thu, Apr 4, 2024 at 10:14 PM Adam King <adking(a)redhat.com> wrote:
>>>
>>>> Sorry to keep asking for more info, but can I also get what `cephadm
>>>> gather-facts` on that host returns for "memory_total_kb"? Might end up
>>>> creating a unit test out of this case if we have a calculation bug here.
>>>>
>>>> On Thu, Apr 4, 2024 at 4:05 PM Mads Aasted <mads2a(a)gmail.com> wrote:
>>>>
>>>>> Sorry for the double send, forgot to hit reply all so it would appear
>>>>> on the page.
>>>>>
>>>>> Hi Adam
>>>>>
>>>>> If we multiply by 0.7 and work through the previous example from that
>>>>> number, we would still arrive at roughly 2.5 GB for each OSD. And the
>>>>> host in question is trying to set it to less than 500 MB.
>>>>> I have attached a list of the processes running on the host. Currently
>>>>> you can even see that the OSDs are taking up the most memory by far, at
>>>>> least 5x the proposed minimum.
>>>>> root@my-ceph01:/# ceph orch ps | grep my-ceph01
>>>>> crash.my-ceph01              my-ceph01               running (3w)  7m ago  13M  9052k  -      17.2.6
>>>>> grafana.my-ceph01            my-ceph01  *:3000       running (3w)  7m ago  13M  95.6M  -      8.3.5
>>>>> mds.testfs.my-ceph01.xjxfzd  my-ceph01               running (3w)  7m ago  10M  485M   -      17.2.6
>>>>> mds.prodfs.my-ceph01.rplvac  my-ceph01               running (3w)  7m ago  12M  26.9M  -      17.2.6
>>>>> mds.prodfs.my-ceph01.twikzd  my-ceph01               running (3w)  7m ago  12M  26.2M  -      17.2.6
>>>>> mgr.my-ceph01.rxdefe         my-ceph01  *:8443,9283  running (3w)  7m ago  13M  907M   -      17.2.6
>>>>> mon.my-ceph01                my-ceph01               running (3w)  7m ago  13M  503M   2048M  17.2.6
>>>>> node-exporter.my-ceph01      my-ceph01  *:9100       running (3w)  7m ago  13M  20.4M  -      1.5.0
>>>>> osd.3                        my-ceph01               running (3w)  7m ago  11M  2595M  4096M  17.2.6
>>>>> osd.5                        my-ceph01               running (3w)  7m ago  11M  2494M  4096M  17.2.6
>>>>> osd.6                        my-ceph01               running (3w)  7m ago  11M  2698M  4096M  17.2.6
>>>>> osd.9                        my-ceph01               running (3w)  7m ago  11M  3364M  4096M  17.2.6
>>>>> prometheus.my-ceph01         my-ceph01  *:9095       running (3w)  7m ago  13M  164M   -      2.42.0
>>>>>
>>>>> On Thu, Mar 28, 2024 at 2:13 AM Adam King <adking(a)redhat.com> wrote:
>>>>>
>>>>>> I missed a step in the calculation. The total_memory_kb I mentioned
>>>>>> earlier is also multiplied by the value of
>>>>>> mgr/cephadm/autotune_memory_target_ratio before doing the subtractions
>>>>>> for all the daemons. That value defaults to 0.7. That might explain it
>>>>>> seeming like it's getting a value lower than expected. Beyond that, I
>>>>>> think I'd need a list of the daemon types and count on that host to try
>>>>>> and work through what it's doing.
>>>>>>
>>>>>> On Wed, Mar 27, 2024 at 10:47 AM Mads Aasted <mads2a(a)gmail.com> wrote:
>>>>>>
>>>>>>> Hi Adam.
>>>>>>>
>>>>>>> So doing the calculations with what you are stating here, I arrive
>>>>>>> at a total of roughly 13.3 GB for all the listed processes, everything
>>>>>>> except the OSDs, leaving well in excess of 4 GB for each OSD.
>>>>>>> Besides the mon daemon, which I can tell on my host has a limit of
>>>>>>> 2 GB, none of the other daemons seem to have a limit set according to
>>>>>>> ceph orch ps. Then again, they are nowhere near the values stated in
>>>>>>> min_size_by_type that you list.
>>>>>>> Obviously yes, I could disable the autotuning, but that would leave
>>>>>>> me none the wiser as to why this exact host is trying to do this.
>>>>>>>
>>>>>>> On Tue, Mar 26, 2024 at 10:20 PM Adam King <adking(a)redhat.com> wrote:
>>>>>>>
>>>>>>>> For context, the value the autotune goes with takes the value from
>>>>>>>> `cephadm gather-facts` on the host (the "memory_total_kb" field) and
>>>>>>>> then subtracts from that per daemon on the host according to
>>>>>>>>
>>>>>>>> min_size_by_type = {
>>>>>>>> 'mds': 4096 * 1048576,
>>>>>>>> 'mgr': 4096 * 1048576,
>>>>>>>> 'mon': 1024 * 1048576,
>>>>>>>> 'crash': 128 * 1048576,
>>>>>>>> 'keepalived': 128 * 1048576,
>>>>>>>> 'haproxy': 128 * 1048576,
>>>>>>>> 'nvmeof': 4096 * 1048576,
>>>>>>>> }
>>>>>>>> default_size = 1024 * 1048576
>>>>>>>>
>>>>>>>> what's left is then divided by the number of OSDs on the host to
>>>>>>>> arrive at the value. I'll also add, since it seems to be an issue on
>>>>>>>> this particular host, if you add the "_no_autotune_memory" label to
>>>>>>>> the host, it will stop trying to do this on that host.
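>>>>>>>>
>>>>>>>> For example (using the host name from this thread), something like
>>>>>>>> the following should add that label:
>>>>>>>>
>>>>>>>> ceph orch host label add my-ceph01 _no_autotune_memory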
>>>>>>>>
>>>>>>>>> On Mon, Mar 25, 2024 at 6:32 PM <mads2a(a)gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I have a virtual Ceph cluster running 17.2.6 with 4 Ubuntu 22.04
>>>>>>>>> hosts in it, each with 4 OSDs attached. The first 2 servers hosting
>>>>>>>>> mgrs have 32GB of RAM each, and the remaining have 24GB.
>>>>>>>>> For some reason I am unable to identify, the first host in the
>>>>>>>>> cluster appears to constantly be trying to set the osd_memory_target
>>>>>>>>> variable to roughly half of the calculated minimum for the cluster,
>>>>>>>>> and I see the following spamming the logs constantly:
>>>>>>>>> Unable to set osd_memory_target on my-ceph01 to 480485376: error
>>>>>>>>> parsing value: Value '480485376' is below minimum 939524096
>>>>>>>>> Default is set to 4294967296.
>>>>>>>>> I did double check, and osd_memory_base (805306368) +
>>>>>>>>> osd_memory_cache_min (134217728) adds up to that minimum exactly
>>>>>>>>> (805306368 + 134217728 = 939524096).
>>>>>>>>> osd_memory_target_autotune is currently enabled. But I cannot for
>>>>>>>>> the life of me figure out how it is arriving at 480485376 as a value
>>>>>>>>> for that particular host, which even has the most RAM. Neither the
>>>>>>>>> cluster nor the host is even approaching max memory utilization, so
>>>>>>>>> it's not like there are processes competing for resources.