I did end up writing a unit test to see what we calculate here, as well as
adding a bunch of debug logging (haven't created a PR yet, but probably
will). The total memory was set to 19858056 * 1024 * 0.7 (total memory in
bytes times the autotune target ratio) = 14234254540. Here is what got
logged (ignore the daemon ids; they don't affect anything, only the types
matter):
DEBUG cephadm.autotune:autotune.py:35 Autotuning OSD memory with given parameters:
Total memory: 14234254540
Daemons: [<DaemonDescription>(crash.a), <DaemonDescription>(grafana.a), <DaemonDescription>(mds.a), <DaemonDescription>(mds.b), <DaemonDescription>(mds.c), <DaemonDescription>(mgr.a), <DaemonDescription>(mon.a), <DaemonDescription>(node-exporter.a), <DaemonDescription>(osd.1), <DaemonDescription>(osd.2), <DaemonDescription>(osd.3), <DaemonDescription>(osd.4), <DaemonDescription>(prometheus.a)]
DEBUG cephadm.autotune:autotune.py:50 Subtracting 134217728 from total for crash daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 14100036812
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for grafana daemon
DEBUG cephadm.autotune:autotune.py:52 new total: 13026294988
DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: -4153574196
DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: -21333443380
DEBUG cephadm.autotune:autotune.py:40 Subtracting 17179869184 from total for mds daemon
DEBUG cephadm.autotune:autotune.py:42 new total: -38513312564
DEBUG cephadm.autotune:autotune.py:50 Subtracting 4294967296 from total for mgr daemon
DEBUG cephadm.autotune:autotune.py:52 new total: -42808279860
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for mon daemon
DEBUG cephadm.autotune:autotune.py:52 new total: -43882021684
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for node-exporter daemon
DEBUG cephadm.autotune:autotune.py:52 new total: -44955763508
DEBUG cephadm.autotune:autotune.py:50 Subtracting 1073741824 from total for prometheus daemon
DEBUG cephadm.autotune:autotune.py:52 new total: -46029505332
It looks like it was subtracting pretty much all of the memory for the mds
daemons. The amount subtracted for each mds daemon is taken from its
"mds_cache_memory_limit" setting, and the value the test was defaulting to
(17179869184 bytes, i.e. 16 GiB per mds) is quite large. I'd need to know
what that setting comes out to for the mds daemons in your cluster to get
a full picture. You can also see the total go well into the negatives
here. When that happens, cephadm just tries to remove the
osd_memory_target config settings for the OSDs on the host; but given the
error message from your initial post, it must be getting some positive
value when actually running on your system.
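
To make that concrete, here is a simplified sketch of the calculation
(paraphrased from memory, not the exact source) that reproduces both the
log above and, when fed your host's numbers, the value from your original
error message:

MIB = 1048576

min_size_by_type = {
    'mds': 4096 * MIB,
    'mgr': 4096 * MIB,
    'mon': 1024 * MIB,
    'crash': 128 * MIB,
    'keepalived': 128 * MIB,
    'haproxy': 128 * MIB,
    'nvmeof': 4096 * MIB,
}
default_size = 1024 * MIB

def autotune(total_kb, ratio, daemon_types, mds_cache_limit=None):
    # total memory in bytes, scaled by the autotune target ratio (0.7)
    total = int(total_kb * 1024 * ratio)
    osds = 0
    for dt in daemon_types:
        if dt == 'osd':
            osds += 1  # OSDs are what we are tuning; nothing is subtracted
            continue
        if dt == 'mds' and mds_cache_limit:
            total -= mds_cache_limit  # mds is sized by mds_cache_memory_limit
        else:
            total -= min_size_by_type.get(dt, default_size)
    if total <= 0 or osds == 0:
        # budget went negative: cephadm instead removes the
        # osd_memory_target settings for the OSDs on the host
        return None
    return total // osds

daemons = ['crash', 'grafana', 'mds', 'mds', 'mds', 'mgr', 'mon',
           'node-exporter', 'osd', 'osd', 'osd', 'osd', 'prometheus']

# Your host: memory_total_kb = 32827840, mds counted at the 4 GiB floor.
# Prints 480485376 -- exactly the value from your original error.
print(autotune(32827840, 0.7, daemons))

# The unit test: 19858056 KB total, mds_cache_memory_limit = 16 GiB.
# Prints None; the running total goes negative, matching the log above.
print(autotune(19858056, 0.7, daemons, mds_cache_limit=17179869184))

So if the three mds daemons on your host are only being counted at the
4 GiB minimum each, the formula lands exactly on 480485376, which would
point at the mds entries eating most of the budget.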
On Fri, Apr 5, 2024 at 2:21 AM Mads Aasted <mads2a(a)gmail.com> wrote:
Hi Adam
No problem, I really appreciate your input :)
The memory stats returned are as follows:
"memory_available_kb": 19858056,
"memory_free_kb": 277480,
"memory_total_kb": 32827840,
On Thu, Apr 4, 2024 at 10:14 PM Adam King <adking(a)redhat.com> wrote:
> Sorry to keep asking for more info, but can I also get what `cephadm
> gather-facts` on that host returns for "memory_total_kb"? Might end up
> creating a unit test out of this case if we have a calculation bug here.
>
> On Thu, Apr 4, 2024 at 4:05 PM Mads Aasted <mads2a(a)gmail.com> wrote:
>
>> Sorry for the double send, forgot to hit reply all so it would appear on
>> the page.
>>
>> Hi Adam
>>
>> If we multiply by 0.7 and work through the previous example from that
>> number, we would still arrive at roughly 2.5 GB for each OSD, yet the
>> host in question is trying to set it to less than 500 MB.
>> I have attached a list of the processes running on the host. Currently
>> you can even see that the OSDs are taking up the most memory by far,
>> each at least 5x the proposed target (quick sum after the list).
>> root@my-ceph01:/# ceph orch ps | grep my-ceph01
>> crash.my-ceph01              my-ceph01               running (3w)  7m ago  13M  9052k      -  17.2.6
>> grafana.my-ceph01            my-ceph01  *:3000       running (3w)  7m ago  13M  95.6M      -  8.3.5
>> mds.testfs.my-ceph01.xjxfzd  my-ceph01               running (3w)  7m ago  10M   485M      -  17.2.6
>> mds.prodfs.my-ceph01.rplvac  my-ceph01               running (3w)  7m ago  12M  26.9M      -  17.2.6
>> mds.prodfs.my-ceph01.twikzd  my-ceph01               running (3w)  7m ago  12M  26.2M      -  17.2.6
>> mgr.my-ceph01.rxdefe         my-ceph01  *:8443,9283  running (3w)  7m ago  13M   907M      -  17.2.6
>> mon.my-ceph01                my-ceph01               running (3w)  7m ago  13M   503M  2048M  17.2.6
>> node-exporter.my-ceph01      my-ceph01  *:9100       running (3w)  7m ago  13M  20.4M      -  1.5.0
>> osd.3                        my-ceph01               running (3w)  7m ago  11M  2595M  4096M  17.2.6
>> osd.5                        my-ceph01               running (3w)  7m ago  11M  2494M  4096M  17.2.6
>> osd.6                        my-ceph01               running (3w)  7m ago  11M  2698M  4096M  17.2.6
>> osd.9                        my-ceph01               running (3w)  7m ago  11M  3364M  4096M  17.2.6
>> prometheus.my-ceph01         my-ceph01  *:9095       running (3w)  7m ago  13M   164M      -  2.42.0
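>>
>> (Quick sum of the MEM USE column: the four OSDs alone are using 2595M +
>> 2494M + 2698M + 3364M, about 11 GiB, and even the smallest of them is at
>> more than 5x the 480485376-byte target the autotune keeps trying to set.)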
>>
>>
>>
>>
>> On Thu, Mar 28, 2024 at 2:13 AM Adam King <adking(a)redhat.com> wrote:
>>
>>> I missed a step in the calculation. The total_memory_kb I mentioned
>>> earlier is also multiplied by the value of
>>> mgr/cephadm/autotune_memory_target_ratio before doing the subtractions
>>> for all the daemons. That value defaults to 0.7, which might explain why
>>> it seems to be getting a lower value than expected. Beyond that, I'd
>>> need a list of the daemon types and counts on that host to try and work
>>> through what it's doing.
>>>
>>> On Wed, Mar 27, 2024 at 10:47 AM Mads Aasted <mads2a(a)gmail.com> wrote:
>>>
>>>> Hi Adam.
>>>>
>>>> Doing the calculations with what you are stating here, I arrive at a
>>>> total sum of roughly 13.3 GB for all the listed processes except the
>>>> OSDs, leaving well in excess of 4 GB for each OSD.
>>>> Besides the mon daemon, which I can tell has a 2 GB limit on my host,
>>>> none of the other daemons seem to have a limit set according to ceph
>>>> orch ps. Then again, they are nowhere near the values stated in the
>>>> min_size_by_type that you list.
>>>> Obviously yes, I could disable the autotuning, but that would leave me
>>>> none the wiser as to why this exact host is trying to do this.
>>>>
>>>>
>>>>
>>>> On Tue, Mar 26, 2024 at 10:20 PM Adam King <adking(a)redhat.com> wrote:
>>>>
>>>>> For context, the value the autotune goes with takes the value from
>>>>> `cephadm gather-facts` on the host (the "memory_total_kb" field) and
>>>>> then subtracts from that per daemon on the host according to
>>>>>
>>>>> min_size_by_type = {
>>>>>     'mds': 4096 * 1048576,
>>>>>     'mgr': 4096 * 1048576,
>>>>>     'mon': 1024 * 1048576,
>>>>>     'crash': 128 * 1048576,
>>>>>     'keepalived': 128 * 1048576,
>>>>>     'haproxy': 128 * 1048576,
>>>>>     'nvmeof': 4096 * 1048576,
>>>>> }
>>>>> default_size = 1024 * 1048576
>>>>>
>>>>> What's left is then divided by the number of OSDs on the host to
>>>>> arrive at the value. I'll also add, since it seems to be an issue on
>>>>> this particular host: if you add the "_no_autotune_memory" label to
>>>>> the host, it will stop trying to do this on that host.
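>>>>> (For example, a hypothetical host with 16 GiB of memory running one
>>>>> mgr, one mon and 4 OSDs would come out to (17179869184 - 4294967296 -
>>>>> 1073741824) / 4 = 2952790016, about 2.75 GiB per OSD.)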
>>>>>
>>>>> On Mon, Mar 25, 2024 at 6:32 PM <mads2a(a)gmail.com> wrote:
>>>>>
>>>>>> I have a virtual ceph cluster running 17.2.6 with 4 Ubuntu 22.04
>>>>>> hosts in it, each with 4 OSDs attached. The first 2 servers, hosting
>>>>>> the mgrs, have 32 GB of RAM each, and the remaining have 24 GB.
>>>>>> For some reason I am unable to identify, the first host in the
>>>>>> cluster appears to constantly be trying to set the osd_memory_target
>>>>>> variable to roughly half of the calculated minimum; I see the
>>>>>> following spamming the logs constantly:
>>>>>>
>>>>>> Unable to set osd_memory_target on my-ceph01 to 480485376: error
>>>>>> parsing value: Value '480485376' is below minimum 939524096
>>>>>>
>>>>>> The default is set to 4294967296. I did double check, and
>>>>>> osd_memory_base (805306368) + osd_memory_cache_min (134217728) adds
>>>>>> up to that minimum exactly (805306368 + 134217728 = 939524096).
>>>>>> osd_memory_target_autotune is currently enabled, but I cannot for the
>>>>>> life of me figure out how it is arriving at 480485376 as a value for
>>>>>> that particular host, which even has the most RAM. Neither the
>>>>>> cluster nor the host is approaching max memory utilization, so it's
>>>>>> not like there are processes competing for resources.
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>>>>
>>>>>>