Hi João,
You can see how much RocksDB space has been used with this command: "ceph daemon osd.X perf
dump", where X is an OSD id on the node you are running the command on.
You are looking for this section in the output:
"bluefs": {
"gift_bytes": 0,
"reclaim_bytes": 0,
"db_total_bytes": 23966253056,
"db_used_bytes": 1714421760,
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"slow_total_bytes": 0,
"slow_used_bytes": 0,
"num_files": 24,
"log_bytes": 552120320,
"log_compactions": 0,
"logged_bytes": 537051136,
"files_written_wal": 1,
"files_written_sst": 11,
"bytes_written_wal": 429315193,
"bytes_written_sst": 601384180,
"bytes_written_slow": 0,
"max_bytes_wal": 0,
"max_bytes_db": 1714421760,
"max_bytes_slow": 0
},
If you have non-zero values in the slow_* entries then your RocksDB is spilling over onto
the HDD.
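If you want to pull just those counters out quickly, something like the following works (a
rough sketch only; it assumes the jq utility is installed on the OSD node and uses osd.0 as
an example id):

# Show DB capacity/usage and any spillover onto the slow device for one OSD
ceph daemon osd.0 perf dump | jq '.bluefs | {db_total_bytes, db_used_bytes, slow_used_bytes}'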
As to whether moving RocksDB and the WAL onto HDD can cause performance degradation, it
depends on how busy your disks are. If your HDDs are already working hard and you are now
going to throw a lot more workload onto them, then performance will degrade, possibly
substantially. I have seen performance impacts of up to 75% when things have started
spilling over from NVMe to HDD.
By that I mean I had a lovely flat line ingesting objects and that line suddenly dropped
by 75% once the RocksDB had filled up and spilt over onto the HDD.
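If you want to keep an eye on this across all the OSDs on a node, you can loop over the
admin sockets (again just a sketch; it assumes the default /var/run/ceph socket layout and
jq being available):

# Report slow-device bytes in use for every OSD admin socket on this node;
# anything non-zero means that OSD's RocksDB has spilt onto the HDD
for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo "$sock: $(ceph daemon "$sock" perf dump | jq '.bluefs.slow_used_bytes')"
done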
From: João Victor Rodrigues Soares <jvrs2683(a)gmail.com>
Date: Wednesday, 25 September 2019 at 14:37
To: "ceph-users(a)ceph.io" <ceph-users(a)ceph.io>
Subject: [ceph-users] Slow Write Issues
Hello,
In my company, we currently have the following infrastructure:
- Ceph Luminous
- OpenStack Pike
We have a cluster of 3 osd nodes with the following configuration:
- 1 x Xeon (R) D-2146NT CPU @ 2.30GHz
- 128GB RAM
- 128GB ROOT DISK
- 12 x 10TB SATA ST10000NM0146 (OSD)
- 1 x Intel Optane P4800X SSD DC 375GB (block.DB / block.wal)
- Ubuntu 16.04
- 2 x 10Gb network interfaces configured with LACP
The compute nodes have
- 4 x 10Gb network interfaces with LACP.
We also have 4 monitors with:
- 4 x 10Gb LACP network interfaces.
- The monitor nodes are at approx. 90% CPU idle time with 32GB / 256GB available RAM
For each OSD disk we have created a 33GB partition for block.db and block.wal.
We have recently been facing a number of performance issues. Virtual machines created in
OpenStack are experiencing slow write speeds (approx. 50MB/s).
Monitoring on the OSD nodes shows an average of 20% CPU iowait time and 70% CPU idle time.
Memory consumption is around 30%.
We have no latency issues (9ms average).
My question is whether what is happening may have to do with the amount of disk dedicated
to DB/WAL. The Ceph documentation recommends that the block.db size be no smaller than 4%
of block. In that case, for each disk in my environment, block.db should be no less than
400GB per OSD.
Another question is whether configuring my disks to use block.db/block.wal on the
mechanical disks themselves could lead to performance degradation.
Regards,
João Victor Rodrigues Soares