For context, the autotune starts from the total memory reported by `cephadm
gather-facts` on the host (the "memory_total_kb" field) and then subtracts
an amount for each non-OSD daemon on the host, according to:
min_size_by_type = {
    'mds': 4096 * 1048576,        # 4 GiB
    'mgr': 4096 * 1048576,        # 4 GiB
    'mon': 1024 * 1048576,        # 1 GiB
    'crash': 128 * 1048576,       # 128 MiB
    'keepalived': 128 * 1048576,  # 128 MiB
    'haproxy': 128 * 1048576,     # 128 MiB
    'nvmeof': 4096 * 1048576,     # 4 GiB
}
default_size = 1024 * 1048576     # 1 GiB for any daemon type not listed above
What's left is then divided by the number of OSDs on the host to arrive at
the per-OSD value. I'll also add, since it seems to be an issue on this
particular host: if you add the "_no_autotune_memory" label to the host,
cephadm will stop trying to autotune memory on that host.
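For example, assuming the hostname from your log message, that label can be
added with:

ceph orch host label add my-ceph01 _no_autotune_memory

To make the arithmetic above concrete, here is a rough sketch of the
calculation in Python. The host size, daemon mix, and OSD count below are
made up for illustration; the real inputs come from `cephadm gather-facts`
and the daemon inventory on the host.

GIB = 1073741824
min_size_by_type = {'mgr': 4096 * 1048576,   # subset of the table above
                    'mon': 1024 * 1048576,
                    'crash': 128 * 1048576}
default_size = 1024 * 1048576                # any daemon type not listed

total = 32 * GIB                             # host memory in bytes (from memory_total_kb)
for daemon_type in ('mgr', 'mon', 'crash'):  # non-OSD daemons running on the host
    total -= min_size_by_type.get(daemon_type, default_size)

osd_memory_target = total // 4               # split what's left across the 4 OSDs
print(osd_memory_target)                     # 7214202880, roughly 6.7 GiB per OSD

So on a healthy 32GB host with one mgr, one mon, one crash daemon, and 4
OSDs, you would expect a per-OSD target well above the 939524096 minimum.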
On Mon, Mar 25, 2024 at 6:32 PM <mads2a(a)gmail.com> wrote:
I have a virtual Ceph cluster running 17.2.6 with 4 Ubuntu 22.04 hosts in
it, each with 4 OSDs attached. The first 2 servers, which host the mgrs,
have 32GB of RAM each, and the remaining have 24GB.
For some reason I am unable to identify, the first host in the cluster
appears to be constantly trying to set the osd_memory_target variable to
roughly half of the calculated minimum for the cluster. I see the
following spamming the logs constantly:
Unable to set osd_memory_target on my-ceph01 to 480485376: error parsing
value: Value '480485376' is below minimum 939524096
The default is set to 4294967296.
I did double-check, and osd_memory_base (805306368) + osd_memory_cache_min
(134217728) adds up to that minimum exactly.
osd_memory_target_autotune is currently enabled, but I cannot for the life
of me figure out how it is arriving at 480485376 as a value for that
particular host, which even has the most RAM. Neither the cluster nor the
host is anywhere near maximum memory utilization, so it's not like there
are processes competing for resources.