February 2020 - ceph-users

ERROR: osd init failed: (1) Operation not permitted

by Ml Ml

Hello List, first of all: Yes - i made mistakes. Now i am trying to recover :-/ I had a healthy 3 node cluster which i wanted to convert to a single one. My goal was to reinstall a fresh 3 Node cluster and start with 2 nodes. I was able to healthy turn it from a 3 Node Cluster to a 2 Node cluster. Then the problems began. I started to change size=1 and min_size=1. (i know, i know, i will never ever to that again!) Health was okay until here. Then over sudden both nodes got fenced...one node refused to boot, mons where missing, etc...to make long story short, here is where i am right now: root@node03:~ # ceph -s cluster b3be313f-d0ef-42d5-80c8-6b41380a47e3 health HEALTH_WARN 53 pgs stale 53 pgs stuck stale monmap e4: 2 mons at {0=10.15.15.3:6789/0,1=10.15.15.2:6789/0} election epoch 298, quorum 0,1 1,0 osdmap e6097: 14 osds: 9 up, 9 in pgmap v93644673: 512 pgs, 1 pools, 1193 GB data, 304 kobjects 1088 GB used, 32277 GB / 33366 GB avail 459 active+clean 53 stale+active+clean root@node03:~ # ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 32.56990 root default -2 25.35992 host node03 0 3.57999 osd.0 up 1.00000 1.00000 5 3.62999 osd.5 up 1.00000 1.00000 6 3.62999 osd.6 up 1.00000 1.00000 7 3.62999 osd.7 up 1.00000 1.00000 8 3.62999 osd.8 up 1.00000 1.00000 19 3.62999 osd.19 up 1.00000 1.00000 20 3.62999 osd.20 up 1.00000 1.00000 -3 7.20998 host node02 3 3.62999 osd.3 up 1.00000 1.00000 4 3.57999 osd.4 up 1.00000 1.00000 1 0 osd.1 down 0 1.00000 9 0 osd.9 down 0 1.00000 10 0 osd.10 down 0 1.00000 17 0 osd.17 down 0 1.00000 18 0 osd.18 down 0 1.00000 my main mistakes seemd to be: -------------------------------- ceph osd out osd.1 ceph auth del osd.1 systemctl stop ceph-osd@1 ceph osd rm 1 umount /var/lib/ceph/osd/ceph-1 ceph osd crush remove osd.1 As far as i can tell, ceph waits and needs data from that OSD.1 (which i removed) root@node03:~ # ceph health detail HEALTH_WARN 53 pgs stale; 53 pgs stuck stale pg 0.1a6 is stuck stale for 5086.552795, current state stale+active+clean, last acting [1] pg 0.142 is stuck stale for 5086.552784, current state stale+active+clean, last acting [1] pg 0.1e is stuck stale for 5086.552820, current state stale+active+clean, last acting [1] pg 0.e0 is stuck stale for 5086.552855, current state stale+active+clean, last acting [1] pg 0.1d is stuck stale for 5086.552822, current state stale+active+clean, last acting [1] pg 0.13c is stuck stale for 5086.552791, current state stale+active+clean, last acting [1] [...] SNIP [...] pg 0.e9 is stuck stale for 5086.552955, current state stale+active+clean, last acting [1] pg 0.87 is stuck stale for 5086.552939, current state stale+active+clean, last acting [1] When i try to start ODS.1 manually, i get: -------------------------------------------- 2020-02-10 18:48:26.107444 7f9ce31dd880 0 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-osd, pid 10210 2020-02-10 18:48:26.134417 7f9ce31dd880 0 filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342) 2020-02-10 18:48:26.184202 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is supported and appears to work 2020-02-10 18:48:26.184209 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2020-02-10 18:48:26.184526 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2020-02-10 18:48:26.184585 7f9ce31dd880 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: extsize is disabled by conf 2020-02-10 18:48:26.309755 7f9ce31dd880 0 filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2020-02-10 18:48:26.633926 7f9ce31dd880 1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2020-02-10 18:48:26.642185 7f9ce31dd880 1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2020-02-10 18:48:26.664273 7f9ce31dd880 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello 2020-02-10 18:48:26.732154 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400, adjusting msgr requires for clients 2020-02-10 18:48:26.732163 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400 was 8705, adjusting msgr requires for mons 2020-02-10 18:48:26.732167 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400, adjusting msgr requires for osds 2020-02-10 18:48:26.732179 7f9ce31dd880 0 osd.1 6002 load_pgs 2020-02-10 18:48:31.939810 7f9ce31dd880 0 osd.1 6002 load_pgs opened 53 pgs 2020-02-10 18:48:31.940546 7f9ce31dd880 -1 osd.1 6002 log_to_monitors {default=true} 2020-02-10 18:48:31.942471 7f9ce31dd880 1 journal close /var/lib/ceph/osd/ceph-1/journal 2020-02-10 18:48:31.969205 7f9ce31dd880 -1 ESC[0;31m ** ERROR: osd init failed: (1) Operation not permittedESC[0m Its mounted: /dev/sdg1 3.7T 127G 3.6T 4% /var/lib/ceph/osd/ceph-1 Is there any way i can get the OSD.1 back in? Thanks a lot, mario

4 years, 2 months

2
1
0 0

Running cephadm as a nonroot user

by Jason Borden

We have been using ceph-deploy in our existing cluster running as a non root user with sudo permissions. I've been working on getting an octopus cluster working using cephadm. During bootstrap I ran into a "execnet.gateway_bootstrap.HostNotFound" issue. It turns out that the problem was caused by an sshd setting we use: "PermitRootLogin no". Since we do not allow root ssh login directly, is there a way to make cephadm use ssh as a nonroot user with sudo permissions like we did with ceph-deploy?

4 years, 2 months

2
4
0 0

How to monitor Ceph MDS operation latencies when slow cephfs performance

by jalagam.ceph＠gmail.com

Hello , Cephfs operations are slow in our cluster , I see low number of operations or throughput in the pools and all other resources as well. I think it is MDS operations that are causing the issue. I increased mds_cache_memory_limit to 3 GB from 1 GB but not seeing any improvements in the user access times. How do I monitor the MDS operations like metadata operations latencies including inode access times update time and directory operations latencies ? we am using 14.2.3 ceph version. I have increased mds_cache_memory_limit but not sure how to check what is been used and how effectively we are using it. # ceph config get mds.0 mds_cache_memory_limit 3221225472 I also see this , we are maninging PG using autoscale , however I see BIAS as 4.0 where as all pools have 1.0 not sure what is this number exactly and how it effect cluster . # ceph osd pool autoscale-status | egrep "cephfs|POOL" POOL SIZE TARGET SIZE RATE RAW CAPACITY RATIO TARGET RATIO BIAS PG_NUM NEW PG_NUM AUTOSCALE cephfs01-metadata 1775M 3.0 167.6T 0.0000 4.0 8 on cephfs01-data0 739.5G 3.0 167.6T 0.0129 1.0 32 on There is one large OMAP. [root@knode25 /]# ceph health detail HEALTH_WARN 1 large omap objects LARGE_OMAP_OBJECTS 1 large omap objects 1 large objects found in pool 'cephfs01-metadata' Search the cluster log for 'Large omap object found' for more details. I recently had similar one and I was able to remove that by running deep scrub , not sure why they are keep forming and how to solve this for good ? Thanks, Uday.

4 years, 2 months

1
0
0 0

Write i/o in CephFS metadata pool

by Samy Ascha

Hi! I've been running CephFS for a while now and ever since setting it up, I've seen unexpectedly large write i/o on the CephFS metadata pool. The filesystem is otherwise stable and I'm seeing no usage issues. I'm in a read-intensive environment, from the clients' perspective and throughput for the metadata pool is consistently larger than that of the data pool. For example: # ceph osd pool stats pool cephfs_data id 1 client io 7.6 MiB/s rd, 19 KiB/s wr, 404 op/s rd, 1 op/s wr pool cephfs_metadata id 2 client io 338 KiB/s rd, 43 MiB/s wr, 84 op/s rd, 26 op/s wr I realise, of course, that this is a momentary display of statistics, but I see this unbalanced r/w activity consistently when monitoring it live. I would like some insight into what may be causing this large imbalance in r/w, especially since I'm in a read-intensive (web hosting) environment. Some of it may be expected in when considering details of my environment and CephFS implementation specifics, so please ask away if more details are needed. With my experience using NFS, I would start by looking at client io stats, like `nfsstat` and tuning e.g. mount options, but I haven't been able to find such statistics for CephFS clients. Is there anything of the sort for CephFS? Are similar stats obtainable in some other way? This might be a somewhat broad question and shallow description, so yeah, let me know if there's anything you would like more details on. Thanks a lot, Samy

4 years, 2 months

5
8
0 0

Is there a performance impact of enabling the iostat module?

by Marc Roos

Is there an performance impact of 'ceph mgr module enable iostat'?

4 years, 2 months

1
0
0 0

'ceph mgr module ls' does not show rbd_support

by Marc Roos

ceph mgr module ls does not show the rbd_support, should this not be listed?

4 years, 2 months

1
0
0 0

about rbd-nbd auto mount at boot time

by 6442642＠163.com

want to auto mount the ceph blockdriver at the boot time .becauseI use RBD-mirror,I just only can use nbd type to mount the blockdriver.I try to use /etc/ceph/rbdmap and /etc/fstab with _netdev.But I find it cann't mount at boot time.It just map for ndb type as /dev/nbd0.I need to mount manually use command "mount /dev/nbd0 /mnt".Has anyone solved this problem?

4 years, 2 months

1
0
0 0

As mon should be deployed in odd numbers, and I have a fourth node, can I deploy a fourth mds only? - 14.2.7

by marcopizzolo＠gmail.com

Presently I have about 1.2B objects (400M w/3 Replicas) and I'm finding the PG scrubbing and deep scrubbing are not completing. There is only 1 client accessing the data, a Samba server. I found large disparities in PG Distribution and Drive utilization. I enabled pg_autoscaler and found that it was reducing the number of PGs per OSD from 116 to 104.5 at this time, but it wasn't helping with space consumption equalization. I found loadbalancer and enabled that and it is in the process of evening out. As we were also having mds crashes even after increasing mds memory, i tried enabling multimds with rank:2 I currently have only 1 spare and would like to potentially enable the mds component on the fourth node (no mon present) but am having some difficulty. Is mon a requirement? I tried ceph-deploy mds create node4 but am having errors. I tried manually creating the /var/lib/ceph/mds/node4 directory and the command to create the keyring but still no joy. What am i missing? Thanks,

4 years, 2 months

3
3
0 0

MDS daemons seem to not be getting assigned a rank and crash. Nautilus 14.2.7

by Michael Sudnick

I made a bug report here: https://tracker.ceph.com/issues/44023 I updated from 14.2.6 yesterday and after the update my MDS daemons would not start. I looked at the logs and seemed to initially have an auth error. Setting the keyring location manually in ceph.conf fixed that, but I now get an error where my MDS daemons try to reconnect replay and rejoin, but then crash. Any suggestions on what I can do to trouble shoot? The big tracker post has some logs attached. Rolling back did not fix things. Thank you. -Michael

4 years, 2 months

1
0
0 0

getting rid of incomplete pg errors

by Hartwig Hauschild

Hi. before I descend into what happened and why it happened: I'm talking about a test-cluster so I don't really care about the data in this case. We've recently started upgrading from luminous to nautilus, and for us that means we're retiring ceph-disk in favour of ceph-volume with lvm and dmcrypt. Our setup is in containers and we've got DBs separated from Data. When testing our upgrade-path we discovered that running the host on ubuntu-xenial and the containers on centos-7.7 leads to lvm inside the containers not using lvmetad because it's too old. That in turn means that not running `vgscan --cache` on the host before adding a LV to a VG essentially zeros the metadata for all LVs in that VG. That happened on two out of three hosts for a bunch of OSDs and those OSDs are gone. I have no way of getting them back, they've been overwritten multiple times trying to figure out what went wrong. So now I have a cluster that's got 16 pgs in 'incomplete', 14 of them with 0 objects, 2 with about 150 objects each. I have found a couple of howtos that tell me to use ceph-objectstore-tool to find the pgs on the active osds and I've given that a try, but ceph-objectstore-tool always tells me it can't find the pg I am looking for. Can I tell ceph to re-init the pgs? Do I have to delete the pools and recreate them? There's no data I can't get back in there, I just don't feel like scrapping and redeploying the whole cluster. -- Cheers, Hardy

4 years, 2 months

3
4
0 0

2024

2023

2022

2021

2020

2019

ceph-users February 2020