Hi Cephers,
I have a number of questions after reading a lot about BlueStore, min_alloc_size, and the impact of writing small files through CephFS with erasure coding.
In a setup using VMware ESX with its VMFS6 and its 1 MB block size on an iSCSI LUN mapped from Ceph RBD (replicated):
1. Wouldn't it be better to reduce the RBD object size to 1 MB as well?
2. When a file smaller than 4k is written to a filesystem within a virtual machine (and passes through all the layers below), will the consumed space be 4k? In other words, does the 4 MB object size of RBD combine many small files into one big object?
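For reference, I assume the object size is chosen at image creation time, along these lines (pool/image names and size are just placeholders):

  # create an image with 1 MB objects instead of the default 4 MB
  rbd create rbd/esx-lun1 --size 2T --object-size 1M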
When using CephFS and erasure coding:
1. I assume using a 4k min_alloc_size_hdd would reduce wasted space but increase fragmentation, as Igor wrote.
2. What is the official way to deal with fragmentation in BlueStore? Is there a defrag tool available or planned?
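For context, what I am considering (assuming these are the right knobs; if I understand correctly, min_alloc_size only takes effect for newly created OSDs):

  # must be set before an OSD is (re)created; existing OSDs keep their value
  ceph config set osd bluestore_min_alloc_size_hdd 4096
  # inspect an OSD's fragmentation rating (0 = none, 1 = fully fragmented)
  ceph daemon osd.0 bluestore allocator score block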
From a performance perspective: my cluster runs on good old FileStore with NVMe journals. I am about to migrate to BlueStore.
1. With a MaxIOSize of 512 KB in VMware, wouldn't bluestore_prefer_deferred_size_hdd = 524288 give me FileStore-like behavior? My aim is to get FileStore-like write latency, because we have a lot of databases.
2. Are there any tradeoffs in doing this?
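For reference, what I have in mind is simply (assuming ceph config set is available on the target version):

  # defer writes up to 512 KB through the WAL, similar to FileStore journaling
  ceph config set osd bluestore_prefer_deferred_size_hdd 524288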
Regards,
Dennis
As a follow-up to our recent memory problems with OSDs (with high pglog
values:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LJPJZPBSQRJ…
), we also see high buffer_anon values, e.g. more than 4 GB with "osd memory target" set to 3 GB. Is there a way to restrict it?
As it is called "anon", I guess it would first be necessary to find out what exactly is behind it?
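For reference, this is how we are reading the per-pool numbers (osd.0 is just an example):

  # per-mempool memory accounting inside the OSD, including buffer_anon
  ceph daemon osd.0 dump_mempools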
Well, maybe it is just as Wido said: with lots of small objects, there will be several problems.
Cheers
Harry
Hello All,
We are trying to get the Zabbix module of our Ceph cluster working, but I'm encountering an issue that has me stuck.
The configuration of the module looks OK, and we can send data using zabbix_sender to the host that is configured in Zabbix. We can also see this data/metric in the graphs that come with the template. So far so good.
However, when testing from the ceph module by running "ceph zabbix send", we get "Sending Data to Zabbix", but when we check the ceph-mgr logs we see:
7fad8b61e700 0 mgr[zabbix] Exception when sending: [Errno 2] No such file or directory
and no data is seen in Zabbix itself. Any clue what we can check? This Ceph cluster runs with an OpenStack cluster, and the ceph-mgr instance is a Docker container; I'm not sure if we need to do anything in particular because of this architecture.
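One thing we plan to verify, in case the module shells out to a zabbix_sender binary that is missing inside the container (the container name below is just a placeholder):

  # show the module configuration, including the configured zabbix_sender path
  ceph zabbix config-show
  # check that the binary actually exists inside the mgr container
  docker exec ceph-mgr which zabbix_sender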
Ceph Version : 12.2.4
Thank you,
Etienne.
Hi All,
A customer of ours has upgraded their cluster from Nautilus to Octopus after experiencing issues with OSDs not being able to connect to each other or to clients/mons/mgrs. The connectivity issues were related to msgr v2 and the require_osd_release setting not being set to nautilus. After fixing this, the OSDs were restarted and all placement groups became active again.
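For reference, I believe the relevant check and fix are along these lines (not necessarily the exact commands that were run):

  # verify what release the cluster currently requires of OSDs
  ceph osd dump | grep require_osd_release
  # raise the required release to nautilus
  ceph osd require-osd-release nautilus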
After unsetting the norecover and nobackfill flags, some OSDs started crashing every few minutes. The OSD log, even with high debug settings, doesn't seem to reveal anything; it just stops logging mid log line.
I've created a bug report: https://tracker.ceph.com/issues/46366
Has anyone experienced something similar?
--
kind regards,
Wout
42on
Hi all,
We're planning to upgrade an existing test 14.2.6 Ceph cluster to the latest 15.2.4.
The existing cluster was deployed using ceph-ansible, and we're still looking for the right steps to do the upgrade.
Shall we do the upgrade with steps like the following?
- Upgrade RPMs on all nodes (mon/mgr first, then osd, then rgw)
- Convert the cluster to use cephadm as mentioned here https://docs.ceph.com/docs/master/cephadm/adoption/
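For the RPM upgrade step, I assume we would use the ceph-ansible rolling update playbook, something like (the inventory file name is just a placeholder):

  # upgrades daemons in order (mons/mgrs first, then osds, then rgws)
  ansible-playbook -i hosts infrastructure-playbooks/rolling_update.yml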
Apology for the newbie question.
Thanks a lot.
Regards,
/ST Wong
I just upgraded from Luminous to Nautilus. The cluster was originally hammer or kraken (can't recall), then jewel, on to luminous, and now nautilus. The message I get from ceph -s is:
"health: HEALTH_WARN
crush map has legacy tunables (require firefly, min is hammer)"
Not sure how to get rid of it. I tried "ceph osd set-require-min-compat-client firefly" per the instructions, but I still have the warning. When I query the OSD map:
ceph osd dump | grep min_compat_client
require_min_compat_client firefly
min_compat_client firefly
The instructions say I can ignore the warning, but that seems like a bad idea.
Windows VMs are running on qemu/kvm via RBD.
Ceph and KVM are running on CentOS 7. I just upgraded all clients to "centos-release-7-8.2003.0.el7.centos.x86_64".
I have rebooted all nodes since the CentOS upgrade and again after the Nautilus upgrade.
I also tried setting "ceph osd set-require-min-compat-client jewel", but the HEALTH_WARN was still present and:
ceph osd dump | grep min_compat_client
require_min_compat_client jewel
min_compat_client firefly
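I am starting to wonder whether the warning is actually about the crush tunables profile rather than the min-compat-client setting, i.e. whether something like this is what's needed (I have not tried it yet, and I understand changing tunables can trigger data movement):

  # check the current tunables profile
  ceph osd crush show-tunables
  # raise the crush tunables to the hammer profile
  ceph osd crush tunables hammer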
Any help is appreciated.
Hi all.
I see this sentence on many sites. Does anyone know why?
> Then turn off print continue. If you have it set to true, you may encounter problems with PUT operations
I use nginx in front of my RGW and proxy-pass the Expect header in it.
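For reference, the setting in question appears to be the rgw_print_continue option, e.g. in ceph.conf (the section name depends on your rgw instance):

  [client.rgw.gateway1]
      rgw print continue = false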
Thanks.
This is a Nautilus install. The documentation seems a bit ambiguous to me; this is for a spinner + SSD, using ceph-volume.
If I put the block.db on the SSD with
"ceph-volume lvm create --bluestore --data /dev/sdd --block.db
/dev/sdc1"
does the WAL exist on the SSD (/dev/sdc1) as well, or does it remain on the HDD (/dev/sdd)?
Conversely, what happens with the block.db if I only place the WAL with --block.wal?
Or do I have to set up separate partitions for the block.db and WAL?
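For completeness, the fully explicit variant I am comparing against would presumably be (the partition layout is just an example):

  # DB and WAL each on their own SSD partition
  ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/sdc1 --block.wal /dev/sdc2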
--
Lindsay
Hi all.
There are high IOPS on my bucket index pool when there are about 1K PUT requests/s.
Is there any way I can debug why there are so many IOPS on the bucket index pool?
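For reference, this is how I am watching the pool at the moment (the pool name is an assumption; use whatever your index pool is called):

  # per-pool client I/O rates
  ceph osd pool stats default.rgw.buckets.index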
Thanks.