Hi Stefan,
hmm... could you please collect performance counters for these two
cases, using the following sequence:
1) reset perf counters for the specific OSD
2) run bench
3) dump perf counters.
Collecting the disks' (both main and db) activity with iostat would be
nice too. But please either increase the benchmark duration or reduce
the iostat probe interval to 0.1 or 0.05 seconds.
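Something along these lines, run on the OSD node, should do (osd.38 and
the device names are just examples, adjust to your setup):

  1) ceph daemon osd.38 perf reset all
  2) ceph tell osd.38 bench -f plain 12288000 4096
  3) ceph daemon osd.38 perf dump > osd.38-perf.json

and in parallel something like (fractional intervals need a reasonably
recent sysstat; otherwise use 1-second probes and a longer bench):

  iostat -xmt 0.1 /dev/sdX /dev/sdY > osd.38-iostat.log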
Thanks,
Igor
On 4/28/2020 8:42 PM, Stefan Priebe - Profihost AG wrote:
> Hi Igor,
>
> but the performance issue is still present even on the recreated OSD.
>
> # ceph tell osd.38 bench -f plain 12288000 4096
> bench: wrote 12 MiB in blocks of 4 KiB in 1.63389 sec at 7.2 MiB/sec
> 1.84k IOPS
>
> vs.
>
> # ceph tell osd.10 bench -f plain 12288000 4096
> bench: wrote 12 MiB in blocks of 4 KiB in 10.7454 sec at 1.1 MiB/sec 279
> IOPS
>
> both backed by the same SAMSUNG SSD as block.db.
>
> Greets,
> Stefan
>
> On 28.04.20 at 19:12, Stefan Priebe - Profihost AG wrote:
>> Hi Igor,
>> On 27.04.20 at 15:03, Igor Fedotov wrote:
>>> Just left a comment at https://tracker.ceph.com/issues/44509
>>>
>>> Generally, bdev-new-db performs no migration; RocksDB might eventually
>>> do that, but there is no guarantee it moves everything.
>>>
>>> One should use bluefs-bdev-migrate to do actual migration.
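>>> For example, something like this (with the OSD stopped; the path and
>>> devices below are just examples, please double-check the exact syntax
>>> for your release):
>>>
>>> ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD} \
>>>     --devs-source /var/lib/ceph/osd/ceph-${OSD}/block \
>>>     --dev-target /var/lib/ceph/osd/ceph-${OSD}/block.db \
>>>     bluefs-bdev-migrate
>>>
>>> This moves BlueFS data still sitting on the main (slow) device over to
>>> the db device.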
>>>
>>> And I think that's the root cause for the above ticket.
>> perfect - this removed all spillover in seconds.
>>
>> Greets,
>> Stefan
>>
>>
>>> Thanks,
>>>
>>> Igor
>>>
>>> On 4/24/2020 2:37 PM, Stefan Priebe - Profihost AG wrote:
>>>> No, not a standalone WAL. I wanted to ask whether bdev-new-db migrates
>>>> the DB and WAL from HDD to SSD.
>>>>
>>>> Stefan
>>>>
>>>>> On 24.04.2020 at 13:01, Igor Fedotov <ifedotov(a)suse.de> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Unless you have 3 different types of disks behind the OSD (e.g. HDD,
>>>>> SSD, NVMe), a standalone WAL makes no sense.
>>>>>
>>>>>
>>>>> On 4/24/2020 1:58 PM, Stefan Priebe - Profihost AG wrote:
>>>>>> Is the WAL device missing? Do I need to run *bluefs-bdev-new-db and
>>>>>> bluefs-bdev-new-wal*?
>>>>>>
>>>>>> Greets,
>>>>>> Stefan
>>>>>>
>>>>>>> On 24.04.2020 at 11:32, Stefan Priebe - Profihost AG
>>>>>>> <s.priebe(a)profihost.ag> wrote:
>>>>>>>
>>>>>>> Hi Igor,
>>>>>>>
>>>>>>> there must be a difference. I purged osd.0 and recreated it.
>>>>>>>
>>>>>>> Now it gives:
>>>>>>> ceph tell osd.0 bench
>>>>>>> {
>>>>>>> "bytes_written": 1073741824,
>>>>>>> "blocksize": 4194304,
>>>>>>> "elapsed_sec": 8.1554735639999993,
>>>>>>> "bytes_per_sec": 131659040.46819863,
>>>>>>> "iops": 31.389961354303033
>>>>>>> }
>>>>>>>
>>>>>>> What's wrong with adding a block.db device later?
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 23.04.20 at 20:34, Stefan Priebe - Profihost AG wrote:
>>>>>>>> Hi,
>>>>>>>> if the OSDs are idle the difference is even worse:
>>>>>>>> # ceph tell osd.0 bench
>>>>>>>> {
>>>>>>>> "bytes_written": 1073741824,
>>>>>>>> "blocksize": 4194304,
>>>>>>>> "elapsed_sec": 15.396707875000001,
>>>>>>>> "bytes_per_sec": 69738403.346825853,
>>>>>>>> "iops": 16.626931034761871
>>>>>>>> }
>>>>>>>> # ceph tell osd.38 bench
>>>>>>>> {
>>>>>>>> "bytes_written": 1073741824,
>>>>>>>> "blocksize": 4194304,
>>>>>>>> "elapsed_sec": 6.8903985170000004,
>>>>>>>> "bytes_per_sec": 155831599.77624846,
>>>>>>>> "iops": 37.153148597776521
>>>>>>>> }
>>>>>>>> Stefan
>>>>>>>> On 23.04.20 at 14:39, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi,
>>>>>>>>> On 23.04.20 at 14:06, Igor Fedotov wrote:
>>>>>>>>>> I don't recall any additional tuning to be applied to the new DB
>>>>>>>>>> volume. And I assume the hardware is pretty much the same...
>>>>>>>>>>
>>>>>>>>>> Do you still have any significant amount of data spilled over
>>>>>>>>>> for these updated OSDs? If not, I don't have any valid
>>>>>>>>>> explanation for the phenomenon.
>>>>>>>>> just the 64k from here:
>>>>>>>>> https://tracker.ceph.com/issues/44509
>>>>>>>>>
>>>>>>>>>> You might want to try "ceph osd bench" to compare OSDs under
>>>>>>>>>> pretty much the same load. Any difference observed?
>>>>>>>>> Servers are the same HW. OSD Bench is:
>>>>>>>>> # ceph tell osd.0 bench
>>>>>>>>> {
>>>>>>>>> "bytes_written": 1073741824,
>>>>>>>>> "blocksize": 4194304,
>>>>>>>>> "elapsed_sec": 16.091414781000001,
>>>>>>>>> "bytes_per_sec": 66727620.822242722,
>>>>>>>>> "iops": 15.909104543266945
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> # ceph tell osd.36 bench
>>>>>>>>> {
>>>>>>>>> "bytes_written": 1073741824,
>>>>>>>>> "blocksize": 4194304,
>>>>>>>>> "elapsed_sec": 10.023828538,
>>>>>>>>> "bytes_per_sec": 107118933.6419194,
>>>>>>>>> "iops": 25.539143953780986
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OSD 0 is a Toshiba MG07SCA12TA SAS 12G
>>>>>>>>> OSD 36 is a Seagate ST12000NM0008-2H SATA 6G
>>>>>>>>>
>>>>>>>>> The SSDs are all the same, like the rest of the HW. But both
>>>>>>>>> drives should give the same performance according to their specs.
>>>>>>>>> The only other difference is that OSD 36 was created directly with
>>>>>>>>> the block.db device (Nautilus 14.2.7) while OSD 0 (14.2.8) was not.
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>>>> On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> is there anything else needed besides running:
>>>>>>>>>>> ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD}
>>>>>>>>>>> bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1
>>>>>>>>>>>
>>>>>>>>>>> I did so some weeks ago and currently I'm seeing that all OSDs
>>>>>>>>>>> originally deployed with --block-db show 10-20% I/O waits, while
>>>>>>>>>>> all those converted using ceph-bluestore-tool show 80-100%
>>>>>>>>>>> I/O waits.
>>>>>>>>>>>
>>>>>>>>>>> Also, is there some tuning available to use more of the SSD?
>>>>>>>>>>> The SSD (block-db) is only saturated at 0-2%.
>>>>>>>>>>>
>>>>>>>>>>> Greets,
>>>>>>>>>>> Stefan
>>>>>>>>>>>