Hi Igor,
but the performance issue is still present even on the recreated OSD.
# ceph tell osd.38 bench -f plain 12288000 4096
bench: wrote 12 MiB in blocks of 4 KiB in 1.63389 sec at 7.2 MiB/sec 1.84k IOPS
vs.
# ceph tell osd.10 bench -f plain 12288000 4096
bench: wrote 12 MiB in blocks of 4 KiB in 10.7454 sec at 1.1 MiB/sec 279 IOPS
Both are backed by the same SAMSUNG SSD as block.db.
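
For reference, the IOPS figures above follow directly from the bench arguments (total bytes and block size) and the elapsed time. A minimal Python sketch of that arithmetic; the helper name bench_iops is made up for illustration and is not part of any Ceph tooling:

```python
def bench_iops(total_bytes: int, block_size: int, elapsed_sec: float) -> float:
    """Return write operations per second for one bench run."""
    blocks = total_bytes / block_size  # number of writes issued
    return blocks / elapsed_sec


# Figures copied from the two bench runs above.
fast = bench_iops(12288000, 4096, 1.63389)  # osd.38
slow = bench_iops(12288000, 4096, 10.7454)  # osd.10

print(f"osd.38: {fast:.0f} IOPS")
print(f"osd.10: {slow:.0f} IOPS")
print(f"osd.38 is {fast / slow:.1f}x faster")
```

Both runs issue the same 3000 writes of 4 KiB, so the gap between the two OSDs comes entirely from the elapsed time.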
Greets,
Stefan
On 28.04.20 at 19:12, Stefan Priebe - Profihost AG wrote:
> Hi Igor,
> On 27.04.20 at 15:03, Igor Fedotov wrote:
>> Just left a comment at https://tracker.ceph.com/issues/44509
>>
>> Generally bdev-new-db performs no migration. RocksDB might eventually do
>> that, but there is no guarantee it moves everything.
>>
>> One should use bluefs-bdev-migrate to do actual migration.
>>
>> And I think that's the root cause for the above ticket.
>
> perfect - this removed all spillover in seconds.
>
> Greets,
> Stefan
>
>
>> Thanks,
>>
>> Igor
>>
>> On 4/24/2020 2:37 PM, Stefan Priebe - Profihost AG wrote:
>>> No, not a standalone WAL. I wanted to ask whether bdev-new-db migrated
>>> the DB and WAL from HDD to SSD.
>>>
>>> Stefan
>>>
>>>> On 24.04.2020 at 13:01, Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>
>>>>
>>>>
>>>> Unless you have 3 different types of disks behind the OSD (e.g. HDD, SSD,
>>>> NVMe), a standalone WAL makes no sense.
>>>>
>>>>
>>>> On 4/24/2020 1:58 PM, Stefan Priebe - Profihost AG wrote:
>>>>> Is the WAL device missing? Do I need to run *bluefs-bdev-new-db and WAL?*
>>>>>
>>>>> Greets,
>>>>> Stefan
>>>>>
>>>>>> On 24.04.2020 at 11:32, Stefan Priebe - Profihost AG
>>>>>> <s.priebe(a)profihost.ag> wrote:
>>>>>>
>>>>>> Hi Igor,
>>>>>>
>>>>>> there must be a difference. I purged osd.0 and recreated it.
>>>>>>
>>>>>> Now it gives:
>>>>>> ceph tell osd.0 bench
>>>>>> {
>>>>>> "bytes_written": 1073741824,
>>>>>> "blocksize": 4194304,
>>>>>> "elapsed_sec": 8.1554735639999993,
>>>>>> "bytes_per_sec": 131659040.46819863,
>>>>>> "iops": 31.389961354303033
>>>>>> }
>>>>>>
>>>>>> What's wrong with adding a block.db device later?
>>>>>>
>>>>>> Stefan
>>>>>>
>>>>>> On 23.04.20 at 20:34, Stefan Priebe - Profihost AG wrote:
>>>>>>> Hi,
>>>>>>> if the OSDs are idle, the difference is even worse:
>>>>>>> # ceph tell osd.0 bench
>>>>>>> {
>>>>>>> "bytes_written": 1073741824,
>>>>>>> "blocksize": 4194304,
>>>>>>> "elapsed_sec": 15.396707875000001,
>>>>>>> "bytes_per_sec": 69738403.346825853,
>>>>>>> "iops": 16.626931034761871
>>>>>>> }
>>>>>>> # ceph tell osd.38 bench
>>>>>>> {
>>>>>>> "bytes_written": 1073741824,
>>>>>>> "blocksize": 4194304,
>>>>>>> "elapsed_sec": 6.8903985170000004,
>>>>>>> "bytes_per_sec": 155831599.77624846,
>>>>>>> "iops": 37.153148597776521
>>>>>>> }
>>>>>>> Stefan
>>>>>>> On 23.04.20 at 14:39, Stefan Priebe - Profihost AG wrote:
>>>>>>>> Hi,
>>>>>>>> On 23.04.20 at 14:06, Igor Fedotov wrote:
>>>>>>>>> I don't recall any additional tuning to be applied to a new DB
>>>>>>>>> volume. And I assume the hardware is pretty much the same...
>>>>>>>>>
>>>>>>>>> Do you still have any significant amount of data spilled over
>>>>>>>>> for these updated OSDs? If not, I don't have any valid
>>>>>>>>> explanation for the phenomenon.
>>>>>>>>
>>>>>>>> just the 64k from here:
>>>>>>>>
>>>>>>>> https://tracker.ceph.com/issues/44509
>>>>>>>>
>>>>>>>>> You might want to try "ceph osd bench" to compare OSDs under
>>>>>>>>> pretty much the same load. Any difference observed?
>>>>>>>>
>>>>>>>> Servers are the same HW. OSD Bench is:
>>>>>>>> # ceph tell osd.0 bench
>>>>>>>> {
>>>>>>>> "bytes_written": 1073741824,
>>>>>>>> "blocksize": 4194304,
>>>>>>>> "elapsed_sec": 16.091414781000001,
>>>>>>>> "bytes_per_sec": 66727620.822242722,
>>>>>>>> "iops": 15.909104543266945
>>>>>>>> }
>>>>>>>>
>>>>>>>> # ceph tell osd.36 bench
>>>>>>>> {
>>>>>>>> "bytes_written": 1073741824,
>>>>>>>> "blocksize": 4194304,
>>>>>>>> "elapsed_sec": 10.023828538,
>>>>>>>> "bytes_per_sec": 107118933.6419194,
>>>>>>>> "iops": 25.539143953780986
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>> OSD 0 is a Toshiba MG07SCA12TA SAS 12G
>>>>>>>> OSD 36 is a Seagate ST12000NM0008-2H SATA 6G
>>>>>>>>
>>>>>>>> The SSDs are all the same, like the rest of the HW. Both drives
>>>>>>>> should give the same performance according to their specs. The
>>>>>>>> only other difference is that OSD 36 was created directly with the
>>>>>>>> block.db device (Nautilus 14.2.7) and OSD 0 (14.2.8) was not.
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> is there anything else needed besides running:
>>>>>>>>>> ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD}
>>>>>>>>>> bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1
>>>>>>>>>>
>>>>>>>>>> I did so some weeks ago, and currently I'm seeing that all OSDs
>>>>>>>>>> originally deployed with --block-db show 10-20% I/O waits, while
>>>>>>>>>> all those that got converted using ceph-bluestore-tool show
>>>>>>>>>> 80-100% I/O waits.
>>>>>>>>>>
>>>>>>>>>> Also, is there some tuning available to use more of the SSD? The
>>>>>>>>>> SSD (block-db) is only saturated at 0-2%.
>>>>>>>>>>
>>>>>>>>>> Greets,
>>>>>>>>>> Stefan
>>>>>>>>>> _______________________________________________
>>>>>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>>>>>>>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io