June 2021 - ceph-users - lists.ceph.io

Cephfs mount not recovering after icmp-not-reachable

by Simon Sutter

Hello everyone! We had a switch outage and the ceph kernel mount did not work anymore. This is the fstab entry: 10.99.10.1:/somefolder /cephfs ceph _netdev,nofail,name=cephcluster,secret=IsSecret 0 0 I reproduced it with disabling the vlan on the switch on which the ceph is reachable, which gives a icmp-not-reachable. I did this for five minutes, after that, "ls /cephfs" just gives a "permission denied" in dmesg i can see this: [ 1412.994921] libceph: mon1 10.99.10.4:6789 session lost, hunting for new mon [ 1413.009325] libceph: mon0 10.99.10.1:6789 session established [ 1452.998646] libceph: mon2 10.99.15.3:6789 session lost, hunting for new mon [ 1452.998679] libceph: mon0 10.99.10.1:6789 session lost, hunting for new mon [ 1461.989549] libceph: mon4 10.99.15.5:6789 socket closed (con state CONNECTING) --- [ 1787.045148] libceph: mon3 10.99.15.4:6789 socket closed (con state CONNECTING) [ 1787.062587] libceph: mon0 10.99.10.1:6789 session established [ 1787.086103] libceph: mon4 10.99.15.5:6789 session established [ 1814.028761] libceph: mds0 10.99.10.4:6801 socket closed (con state OPEN) [ 1815.029811] libceph: mds0 10.99.10.4:6801 connection reset [ 1815.029829] libceph: reset on mds0 [ 1815.029831] ceph: mds0 closed our session [ 1815.029833] ceph: mds0 reconnect start [ 1815.052219] ceph: mds0 reconnect denied [ 1815.052229] ceph: dropping dirty Fw state for ffff9d9085da1340 1099512175611 [ 1815.052231] ceph: dropping dirty+flushing Fw state for ffff9d9085da1340 1099512175611 [ 1815.273008] libceph: mds0 10.99.10.4:6801 socket closed (con state NEGOTIATING) [ 1816.033241] ceph: mds0 rejected session [ 1829.018643] ceph: mds0 hung [ 1880.088504] ceph: mds0 came back [ 1880.088662] ceph: mds0 caps renewed [ 1880.094018] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm [ 1881.100367] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm [ 2046.768969] conntrack: generic helper won't handle protocol 47. Please consider loading the specific helper module. [ 2061.731126] ceph: get_quota_realm: ino (10000000afe.fffffffffffffffe) null i_snap_realm Is this a bug to report or wrong configuration? Did someone else had this before? To solve the problem, a simple remount does the trick. Thanks in advance Simon

2 years, 11 months

3
2
0 0

CephFS design

by Szabo, Istvan (Agoda)

Hi, Can you suggest me what is a good cephfs design? I've never used it, only rgw and rbd we have, but want to give a try. Howvere in the mail list I saw a huge amount of issues with cephfs so would like to go with some let's say bulletproof best practices. Like separate the mds from mon and mgr? Need a lot of memory? Should be on ssd or nvme? How many cpu/disk ... Very appreciate it. Istvan Szabo Senior Infrastructure Engineer --------------------------------------------------- Agoda Services Co., Ltd. e: istvan.szabo(a)agoda.com<mailto:istvan.szabo@agoda.com> --------------------------------------------------- ________________________________ This message is confidential and is for the sole use of the intended recipient(s). It may also be privileged or otherwise protected by copyright or other legal rules. If you have received it by mistake please let us know by reply email and delete it from your system. It is prohibited to copy this message or disclose its content to anyone. Any confidentiality or privilege is not waived or lost by any mistaken delivery or unauthorized disclosure of the message. All messages sent to and from Agoda may be monitored to ensure compliance with company policies, to protect the company's interests and to remove potential malware. Electronic messages may be intercepted, amended, lost or deleted, or contain viruses.

2 years, 11 months

7
11
0 0

cephadm failed in Pacific release: Unable to set up "admin" label

by Ralph Soika

hi, if you follow the latest install guide here <https://docs.ceph.com/en/latest/cephadm/install/> to install pacific release, the bootstrap command will print the following error message: .... ...... mgr epoch 13 is available Generating a dashboard self-signed certificate... Creating initial admin user... Fetching dashboard port number... Ceph Dashboard is now available at: URL: https://ceph-1:8443/ User: admin Password: xxx Enabling client.admin keyring and conf on hosts with "admin" label Non-zero exit code 22 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e CONTAINER_IMAGE=docker.io/ceph/ceph:v16 -e NODE_NAME=tikal-ceph-1 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/d6c1ba28-cd0d-11eb-8b39-960000bd038e:/var/log/ceph:z -v /tmp/ceph-tmp9t3y8y39:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpaepq1rte:/etc/ceph/ceph.conf:z docker.io/ceph/ceph:v16 orch client-keyring set client.admin label:_admin /usr/bin/ceph: stderr Invalid command: client-keyring not in start|stop|restart|redeploy|reconfig /usr/bin/ceph: stderr orch start|stop|restart|redeploy|reconfig <service_name> : Start, stop, restart, redeploy, or reconfig an entire service (i.e. all daemons) /usr/bin/ceph: stderr Error EINVAL: invalid command Unable to set up "admin" label; assuming older version of Ceph ...... .... As a result the first node is not working correctly and you can not add mom hosts. I tried to set the label manually afterwards with : # ceph orch host label add ceph-1 _admin But this seems to have no effect. Is this problem known? So only solution seems to stay with the octopus release :-/ === Ralph

2 years, 11 months

2
3
0 0

does ceph rgw has any option to limit bandwidth

by Zhenshi Zhou

Hi, Is there any option of rados gateway that limit bandwidth?

2 years, 11 months

5
6
0 0

Creating a role in another tenant seems to be possible

by Daniel Iwan

Hi It seems that with command like this aws --profile=my-user-tenant1 --endpoint=$HOST_S3_API --region="" iam create-role --role-name="tenant2\$TemporaryRole" --assume-role-policy-document file://json/trust-policy-assume-role.json I can create a role in another tenant. Executing user have roles:* capability which I think is necessary to be able to create roles, but at the same time it seems to be a global ability, for all tenants. Similarly, a federated user who assumes a role with iam:CreateRole permission can create an arbitrary role like below. aws --endpoint=$HOST_S3_API --region="" iam create-role --role-name="tenant2\$TemporaryRole" --assume-role-policy-document file://json/trust-policy-assume-role.json Example permission policy { "Statement":[ {"Effect":"Allow","Action":["iam:GetRole"]}, {"Effect":"Allow","Action":["iam:CreateRole"]} ] } Capability roles:* is not needed in this case, which I think is correct, because only permission policy of the assumed role is checked. Getting information about a role from other tenants is possible with iam:GetRole. This is less controversial but I would still expect it to be scoped to the user's tenant unless explicit tenant name is stated in the policy like this {"Effect":"Allow","Action":["iam:GetRole"],"Resource":"arn:aws:iam::tenant2:*"} Possibly I'm missing something. Why is crossing tenants possible? Regards Daniel

2 years, 11 months

2
4
0 0

stretched cluster or not, with mon in 3 DC and osds on 2 DC

by aderumier＠odiso.com

Hi, I'm currently reading the documentation about stretched cluster, I would like to known if it's needed or not with this kind of 3 dc setup: 3km (0.2ms) DC1--------------DC2 30km(3ms) | | 30km (2-3ms) |--------DC3------ DC1 && DC2 are near each other, small latency. (0.2ms) DC3 is at 30km with bigger latency. (2ms) separated links between dc with different physical path 1 monitor on each dc osd on DC1/DC2, with size=4 Cluster is full nvme or ssd, lowest latency is required for osd replication. Now, I really don't known if latency monitor at DC3 could have an impact on osd read/write latency is this monitor elected ? vs stretched cluster with osd only use local dc monitors ? What is the advantage of stretch cluster here ? (with good redudant links between sites)

2 years, 11 months

2
1
0 0

recovery_unfound during scrub with auto repair = true

by Dan van der Ster

Hi all, The cluster here is running v14.2.20 and is used for RBD images. We have a PG in recovery_unfound state and since this is the first time we've had this occur, we wanted to get your advice on the best course of action. PG 4.1904 went into state active+recovery_unfound+degraded+repair [1] during normal scrubbing (but note that we have `osd scrub auto repair = true`). 2021-06-13 03:15:11.559680 osd.951 (osd.951) 138 : cluster [DBG] 4.1904 repair starts 2021-06-13 04:00:49.369256 osd.951 (osd.951) 139 : cluster [ERR] 4.1904 shard 951 soid 4:209cfddb:::rbd_data.3a4ff12d847b61.000000000001c39e:head : candidate had a read error The scrub detected a read error on the primary of this PG, and tried to repair it by reading from the other 2 osds: Jun 13 04:00:46 xxx kernel: sd 0:0:25:0: [sdp] tag#6 FAILED Result: hostbyte=DID_OK driverbyte=DR Jun 13 04:00:46 xxx kernel: sd 0:0:25:0: [sdp] tag#6 Sense Key : Medium Error [current] [descript Jun 13 04:00:46 xxx kernel: sd 0:0:25:0: [sdp] tag#6 Add. Sense: Unrecovered read error Jun 13 04:00:46 xxx kernel: sd 0:0:25:0: [sdp] tag#6 CDB: Read(16) 88 00 00 00 00 02 ba 8c 0b 00 Jun 13 04:00:46 xxx kernel: blk_update_request: critical medium error, dev sdp, sector 1171967531 But it seems that the other 2 osds could not repair this failed read on the primary because they don't have the correct version of the object: 2021-06-13 04:28:29.412765 osd.951 (osd.951) 140 : cluster [ERR] 4.1904 repair 0 missing, 1 inconsistent objects 2021-06-13 04:28:29.413320 osd.951 (osd.951) 141 : cluster [ERR] 4.1904 repair 1 errors, 1 fixed 2021-06-13 04:28:29.445659 osd.14 (osd.14) 414 : cluster [ERR] 4.1904 push 4:209cfddb:::rbd_data.3a4ff12d847b61.000000000001c39e:head v 3592634'367863320 failed because local copy is 3593555'368312656 2021-06-13 04:28:29.472554 osd.344 (osd.344) 124 : cluster [ERR] 4.1904 push 4:209cfddb:::rbd_data.3a4ff12d847b61.000000000001c39e:head v 3592634'367863320 failed because local copy is 3593555'368312656 2021-06-13 04:28:30.863807 mgr.yyy (mgr.692832499) 648287 : cluster [DBG] pgmap v557097: 19456 pgs: 1 active+recovery_unfound+degraded+repair, 2 active+clean+scrubbing, 19423 active+clean, 30 active+clean+scrubbing+deep+repair; 1.3 PiB data, 4.0 PiB used, 2.1 PiB / 6.1 PiB avail; 350 MiB/s rd, 766 MiB/s wr, 16.93k op/s; 3/1063641423 objects degraded (0.000%); 1/354547141 objects unfound (0.000%) I don't understand how the versions of the objects would get out of sync -- there have been no other recent failures on these disks, AFAICT. So my best guess is that the IO error on 951 confused the repair process -- the osd.951 tried to recover the non-latest version of the object. (This would imply that the object versions on osds 14 and 344 are in fact the correct newest versions). We have a few ideas how to fix this: * osd 951 is sick, so drain it by setting `ceph osd primary-affinity 951 0` and `ceph osd out 951` * osd 951 is really sick, so just stop it now and backfill its PGs to other OSDs. * Don't stop osd 951 yet: Restart all three relevant OSDs and see if that fixes the object versions. * Don't drain osd 951 yet: Make OSD 14 or 344 the primary for this PG, (e.g. ceph osd primary-affinity 951 0) then run `ceph pg repair 4.1904` so that the version from osds 14/344 can be pushed. * Use mark_unfound_lost revert, or delete. (and inform the user their image to fsck their image). Does anyone have some recent experience or advice on this issue? Best Regards, Dan [1] # ceph pg 4.1904 query { "state": "active+recovery_unfound+degraded+repair", "snap_trimq": "[1c7fd~1,1c7ff~1,1c801~1,1c803~1,1c805~1]", "snap_trimq_len": 5, "epoch": 3593586, "up": [ 951, 344, 14 ], "acting": [ 951, 344, 14 ], "acting_recovery_backfill": [ "14", "344", "951" ], ...

2 years, 11 months

1
2
0 0

bluestore label returned: (2) No such file or directory

by Karl Mardoff Kittilsen

Hi List I have a osd (83) that fails to start. It is made up of one 4TB drive and an 80GB DB on nvme. There was a cluster-full situation that is now solved, however I am quite sure the issue with this particular osd is unrelated. When I try to start the osd it failes to read the label of the block.db device failing with the followint lines: 1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-83/block.db size 80 GiB -1 bluestore(/var/lib/ceph/osd/ceph-83) _minimal_open_bluefs check block device(/var/lib/ceph/osd/ceph-83/block.db) label returned: (2) No such file or directory 1 bdev(0x563d8822a700 /var/lib/ceph/osd/ceph-83/block.db) close 1 bdev(0x563d8822a000 /var/lib/ceph/osd/ceph-83/block) close -1 osd.83 0 OSD:init: unable to mount object store -1 ** ERROR: osd init failed: (2) No such file or directory (Full log: https://gist.github.com/NightDog/7b50349da1410bb05bd7f4d54a02f055) The last thing that happened to the OSD before it started to fail booting was it being terminated during what I believe was a reboot: received signal: Terminated from Kernel ( Could be generated by pthread_kill(), raise(), abort(), alarm() ) UID: 0 Last lines++ of the last successful run: https://gist.github.com/NightDog/fd9b4b7b3e0c0c2ba29ce5d325bb97c6 When I try to run ceph-bluestore-tool --log-level 30 show-label on the block.db it returns: "unable to read label for /dev/ceph-00ed472c-f900-4dc3-9ddc-0e2f3b6547e3/osd-db-bb0eaa16-a1e0-4985-b4bd-74799e5226be: (2) No such file or directory" The block returns the label fine (see master gist): https://gist.github.com/NightDog/4518bf11b364170911e5743b5ed0f614 The strange thing is however that lvs -o lv_tags returns just fine for the block.db: root@ceph-node201:~# lvs -o lv_tags /dev/ceph-00ed472c-f900-4dc3-9ddc-0e2f3b6547e3/osd-db-bb0eaa16-a1e0-4985-b4bd-74799e5226be LV Tags ceph.block_device=/dev/ceph-ff60b68a-26fe-4294-8bec-4a9c329e858d/osd-block-73ab12e6-7758-4ebe-9319-5935309fcacd,ceph.block_uuid=nbRXYl-fRrQ-qyYP-D93c-IGct-yKg4-rujDOX,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=f4495398-a8c4-4ad9-8219-80c48625abdf,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/ceph-00ed472c-f900-4dc3-9ddc-0e2f3b6547e3/osd-db-bb0eaa16-a1e0-4985-b4bd-74799e5226be,ceph.db_uuid=K09p3L-QV06-LOLO-uVeT-2ulz-GD3O-CEyRcs,ceph.encrypted=0,ceph.osd_fsid=73ab12e6-7758-4ebe-9319-5935309fcacd,ceph.osd_id=83,ceph.osdspec_affinity=osd-spec-2xx,ceph.type=db,ceph.vdo=0 So it seems to me that for some reason, ceph-bluestore-tool fails to read the label of the block.db device, even tho it is there, and then fails the startup of the OSD. Trying to write keys with ceph-bluestore-tool set-label-key fails with the same error message. I see no reason why there should be any damage to either the .db or block device, and since the labels are there in LVM, guess ceph-bluestore-tool errors out on something else? Would it be possible to get some help with regards to getting this .db and OSD back up again? Thanks! PS: Running version 15.2.8, also tried with 16.2.3-> show-label, with same result. -- Regards Karl M. Kittilsen

2 years, 11 months

1
0
0 0

Ceph Poor RBD Performance

by Eren Cankurtaran

Hi. I'm having performance issue about ceph rbd. The performance is not i expected according to my node metrics. here're the metrics. I've used Calico as CNI. version: Rook-ceph 1.6 I've used stock yaml files and rook is not running on host network Centos 8 Stream [root@node4 ~]# uname -a Linux node4 4.18.0-240.el8.x86_64 #1 SMP Fri Sep 25 19:48:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux [root@rook-ceph-tools-fc5f9586c-nb2h7 /]# ceph health HEALTH_WARN mons are allowing insecure global_id reclaim; clock skew detected on mon.b, mon.c; 1 pool(s) do not have an application enabled [root@node1 ~]# kubectl version Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"132a687512d7fb058d0f5890f07d4121b3f0a2e2", GitTreeState:"clean", BuildDate:"2021-05-12T12:40:09Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"} Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.7", GitCommit:"132a687512d7fb058d0f5890f07d4121b3f0a2e2", GitTreeState:"clean", BuildDate:"2021-05-12T12:32:49Z", GoVersion:"go1.15.12", Compiler:"gc", Platform:"linux/amd64"} [root@node4 ~]# iperf3 -c 172.16.11.181 -p 3000 M Connecting to host 172.16.11.181, port 3000 [ 5] local 172.16.11.180 port 46390 connected to 172.16.11.181 port 3000 [ ID] Interval Transfer Bitrate Retr Cwnd [ 5] 0.00-1.00 sec 201 MBytes 1.69 Gbits/sec 488 48.1 KBytes [ 5] 1.00-2.00 sec 160 MBytes 1.34 Gbits/sec 397 39.6 KBytes [ 5] 2.00-3.00 sec 201 MBytes 1.68 Gbits/sec 513 69.3 KBytes [ 5] 3.00-4.00 sec 200 MBytes 1.68 Gbits/sec 374 38.2 KBytes [ 5] 4.00-5.00 sec 199 MBytes 1.67 Gbits/sec 402 48.1 KBytes [ 5] 5.00-6.00 sec 201 MBytes 1.69 Gbits/sec 559 48.1 KBytes [ 5] 6.00-7.00 sec 204 MBytes 1.71 Gbits/sec 470 45.2 KBytes [ 5] 7.00-8.00 sec 199 MBytes 1.67 Gbits/sec 575 46.7 KBytes [ 5] 8.00-9.00 sec 200 MBytes 1.68 Gbits/sec 404 49.5 KBytes [ 5] 9.00-10.00 sec 200 MBytes 1.68 Gbits/sec 391 49.5 KBytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 1.92 GBytes 1.65 Gbits/sec 4573 sender [ 5] 0.00-10.04 sec 1.92 GBytes 1.64 Gbits/sec receiver *In pod that mounted rbd volume cassandra@k8ssandra-dc1-default-sts-0:/var/lib/cassandra$ dd if=/dev/zero of=/var/lib/cassandra/test.img bs=1G count=1 oflag=dsync 1+0 records in 1+0 records out 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 28.051 s, 38.3 MB/s [root@rook-ceph-tools-fc5f9586c-nb2h7 /]# ceph status cluster: id: d6584e08-f8f5-43e4-a258-8d652cc28e0a health: HEALTH_WARN mons are allowing insecure global_id reclaim clock skew detected on mon.b, mon.c 1 pool(s) do not have an application enabled services: mon: 3 daemons, quorum a,b,c (age 32h) mgr: a(active, since 3h) osd: 24 osds: 24 up (since 32h), 24 in (since 3d) data: pools: 4 pools, 97 pgs objects: 181.38k objects, 686 GiB usage: 1.3 TiB used, 173 TiB / 175 TiB avail pgs: 97 active+clean io: client: 308 KiB/s wr, 0 op/s rd, 4 op/s wr [root@rook-ceph-tools-fc5f9586c-nb2h7 /]# rados bench -p scbench 15 write seq --no-cleanup hints = 1 Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304 for up to 15 seconds or 0 objects Object prefix: benchmark_data_rook-ceph-tools-fc5f9586c-nb2_37098 sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s) 0 0 0 0 0 0 - 0 1 16 19 3 11.9985 12 0.480534 0.377567 2 16 22 6 11.9986 12 1.77015 1.04767 3 16 27 11 14.6652 20 2.48028 1.26595 4 16 29 13 12.9988 8 0.766731 1.18763 5 16 34 18 14.3986 20 4.95184 1.44111 6 16 41 25 16.6651 28 0.742371 1.91373 7 16 45 29 16.5699 16 1.00815 2.14879 8 16 49 33 16.4985 16 5.65826 2.1447 9 16 57 41 18.2206 32 0.941691 2.46741 10 16 65 49 19.5982 32 4.63243 2.45243 11 16 70 54 19.6346 20 0.122698 2.4876 12 16 74 58 19.3316 16 0.700267 2.50753 13 16 76 60 18.4599 8 1.3582 2.54338 14 16 78 62 17.7127 8 5.65752 2.63185 15 16 80 64 17.0651 8 5.25659 2.77212 16 15 80 65 16.2485 4 5.95131 2.82103 17 11 80 69 16.2337 16 10.4721 3.09282 18 6 80 74 16.4429 20 7.71085 3.23797 19 1 80 79 16.63 20 5.98475 3.47738 Total time run: 19.0674 Total writes made: 80 Write size: 4194304 Object size: 4194304 Bandwidth (MB/sec): 16.7826 Stddev Bandwidth: 8.02919 Max bandwidth (MB/sec): 32 Min bandwidth (MB/sec): 4 Average IOPS: 4 Stddev IOPS: 2.0073 Max IOPS: 8 Min IOPS: 1 Average Latency(s): 3.51805 Stddev Latency(s): 2.97502 Max latency(s): 10.4721 Min latency(s): 0.104524 [root@rook-ceph-tools-fc5f9586c-nb2h7 /]# ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 174.65735 root default -5 36.38695 host node4 0 hdd 7.27739 osd.0 up 1.00000 1.00000 4 hdd 7.27739 osd.4 up 1.00000 1.00000 8 hdd 7.27739 osd.8 up 1.00000 1.00000 12 hdd 7.27739 osd.12 up 1.00000 1.00000 16 hdd 7.27739 osd.16 up 1.00000 1.00000 -9 36.38695 host node5 2 hdd 7.27739 osd.2 up 1.00000 1.00000 6 hdd 7.27739 osd.6 up 1.00000 1.00000 10 hdd 7.27739 osd.10 up 1.00000 1.00000 14 hdd 7.27739 osd.14 up 1.00000 1.00000 18 hdd 7.27739 osd.18 up 1.00000 1.00000 -7 36.38695 host node6 1 hdd 7.27739 osd.1 up 1.00000 1.00000 5 hdd 7.27739 osd.5 up 1.00000 1.00000 9 hdd 7.27739 osd.9 up 1.00000 1.00000 13 hdd 7.27739 osd.13 up 1.00000 1.00000 17 hdd 7.27739 osd.17 up 1.00000 1.00000 -3 29.10956 host node7 3 hdd 7.27739 osd.3 up 1.00000 1.00000 7 hdd 7.27739 osd.7 up 1.00000 1.00000 11 hdd 7.27739 osd.11 up 1.00000 1.00000 15 hdd 7.27739 osd.15 up 1.00000 1.00000 -11 36.38695 host node8 19 hdd 7.27739 osd.19 up 1.00000 1.00000 20 hdd 7.27739 osd.20 up 1.00000 1.00000 21 hdd 7.27739 osd.21 up 1.00000 1.00000 22 hdd 7.27739 osd.22 up 1.00000 1.00000 23 hdd 7.27739 osd.23 up 1.00000 1.00000 [root@node1 ~]# kubectl get pods -n rook-ceph NAME READY STATUS RESTARTS AGE csi-cephfsplugin-9x4dc 3/3 Running 9 14d csi-cephfsplugin-cv7k9 3/3 Running 9 14d csi-cephfsplugin-f6s4k 3/3 Running 22 14d csi-cephfsplugin-fn9w4 3/3 Running 12 14d csi-cephfsplugin-provisioner-59499cbcdd-txtjq 6/6 Running 0 33h csi-cephfsplugin-provisioner-59499cbcdd-wj9dx 6/6 Running 11 4d7h csi-cephfsplugin-tdp99 3/3 Running 9 14d csi-rbdplugin-4bv84 3/3 Running 12 14d csi-rbdplugin-ddgq8 3/3 Running 22 14d csi-rbdplugin-hnnx2 3/3 Running 10 14d csi-rbdplugin-provisioner-857d65496c-qzbcq 6/6 Running 0 33h csi-rbdplugin-provisioner-857d65496c-s6nnd 6/6 Running 9 4d7h csi-rbdplugin-prthz 3/3 Running 9 14d csi-rbdplugin-tjj7z 3/3 Running 12 14d rook-ceph-crashcollector-node4-5bb88d4866-rfv54 1/1 Running 1 4d7h rook-ceph-crashcollector-node5-75dcbb9f45-5tlv8 1/1 Running 0 33h rook-ceph-crashcollector-node6-777c9f579b-zzf69 1/1 Running 0 33h rook-ceph-crashcollector-node7-84464d7cc5-2zthq 1/1 Running 0 33h rook-ceph-crashcollector-node8-66bcdc9c4b-drcj2 1/1 Running 0 3d12h rook-ceph-mgr-a-fb794cf97-cbmws 1/1 Running 26 34h rook-ceph-mon-a-58bf5db667-lhv26 1/1 Running 0 33h rook-ceph-mon-b-78df76b9f8-p7w69 1/1 Running 0 33h rook-ceph-mon-c-6fc84459fd-nc8hx 1/1 Running 0 4d8h rook-ceph-operator-65965c66b5-z5xfl 1/1 Running 2 4d7h rook-ceph-osd-0-59747d74d6-lg5lh 1/1 Running 16 4d7h rook-ceph-osd-1-56c66f7d49-vxj9m 1/1 Running 0 33h rook-ceph-osd-10-6cf5f76d57-g5jhb 1/1 Running 0 33h rook-ceph-osd-11-5446ff487f-qsdmw 1/1 Running 0 33h rook-ceph-osd-12-db4859779-5vrvm 1/1 Running 8 4d7h rook-ceph-osd-13-65d4859986-zx8kx 1/1 Running 0 33h rook-ceph-osd-14-5c6db6c8c9-ghhtk 1/1 Running 0 33h rook-ceph-osd-15-568c6bbf64-m2f2v 1/1 Running 0 33h rook-ceph-osd-16-7d475cc4cc-9lzkp 1/1 Running 8 4d7h rook-ceph-osd-17-6487d4ff79-5x8bh 1/1 Running 0 33h rook-ceph-osd-18-7dd896c985-bjvbz 1/1 Running 0 33h rook-ceph-osd-19-5ccfbf896f-gll6k 1/1 Running 1 4d8h rook-ceph-osd-2-7db5897c7b-fxc4k 1/1 Running 0 33h rook-ceph-osd-20-7b7475f95c-cfprr 1/1 Running 1 4d8h rook-ceph-osd-21-7c544458cf-mfbzr 1/1 Running 0 4d8h rook-ceph-osd-22-76ff6df84-tf4hx 1/1 Running 0 4d8h rook-ceph-osd-23-86b976fdb4-c5xz9 1/1 Running 0 4d8h rook-ceph-osd-3-6ccbd79b9d-dvrwj 1/1 Running 0 33h rook-ceph-osd-4-56c4c85686-pm488 1/1 Running 9 4d7h rook-ceph-osd-5-75dd798967-rmhrp 1/1 Running 0 33h rook-ceph-osd-6-f66fd44fb-f2h2d 1/1 Running 0 33h rook-ceph-osd-7-6fbb57486c-4xvzh 1/1 Running 0 33h rook-ceph-osd-8-59567ccf6-xnmx7 1/1 Running 8 4d7h rook-ceph-osd-9-5b9bc76c98-f8dbf 1/1 Running 0 33h rook-ceph-osd-prepare-node4-jgwvb 0/1 Completed 0 3h1m rook-ceph-osd-prepare-node5-jwvsk 0/1 Completed 0 3h1m rook-ceph-osd-prepare-node6-r2ps9 0/1 Completed 0 3h1m rook-ceph-osd-prepare-node7-cmfsf 0/1 Completed 0 3h1m rook-ceph-osd-prepare-node8-kk256 0/1 Completed 0 3h1m rook-ceph-tools-fc5f9586c-nb2h7 1/1 Running 0 34h

2 years, 11 months

1
0
0 0

Kubernetes - How to create a PersistentVolume on an existing durable ceph volume?

by Ralph Soika

Hi, I have setup a ceph cluster (octopus) and installed he rbd plugins/provisioner in my Kubernetes cluster. I can create dynamically FS and Block Volumes which is fine. For that I have created the following the following storageClass apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: ceph provisioner: rbd.csi.ceph.com parameters: clusterID: <clusterID> pool: kubernetes imageFeatures: layering csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret csi.storage.k8s.io/provisioner-secret-namespace: ceph-system csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret csi.storage.k8s.io/controller-expand-secret-namespace: ceph-system csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret csi.storage.k8s.io/node-stage-secret-namespace: ceph-system reclaimPolicy: Delete allowVolumeExpansion: true mountOptions: - discard This works fine for ephemeral dynamically crated volumes. But now I want to use durable volume with the reclaimPolicy:Retain. I expect that I need to create the image in my kubernetes pool on the ceph cluster first - which I have done. I defined the following new storage class with the reclaimPolicy 'Retain': apiVersion: storage.k8s.io/v1 kind: StorageClass metadata: name: ceph-durable provisioner: rbd.csi.ceph.com parameters: clusterID: <clusterID> pool: kubernetes imageFeatures: layering csi.storage.k8s.io/provisioner-secret-name: csi-rbd-secret csi.storage.k8s.io/provisioner-secret-namespace: ceph-system csi.storage.k8s.io/controller-expand-secret-name: csi-rbd-secret csi.storage.k8s.io/controller-expand-secret-namespace: ceph-system csi.storage.k8s.io/node-stage-secret-name: csi-rbd-secret csi.storage.k8s.io/node-stage-secret-namespace: ceph-system reclaimPolicy: Retain allowVolumeExpansion: true mountOptions: - discard And finally I created the following PersistentVolume and PersistentVolumeClaim: --- kind: PersistentVolume apiVersion: v1 metadata: name: demo-internal-index spec: capacity: storage: 1Gi volumeMode: Filesystem accessModes: - ReadWriteOnce claimRef: namespace: office-demo-internal name: index csi: driver: driver.ceph.io fsType: ext4 volumeHandle: demo-internal-index storageClassName: ceph-durable --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: index namespace: office-demo-internal spec: accessModes: - ReadWriteOnce storageClassName: ceph-durable resources: requests: storage: 1Gi volumeName: "demo-internal-index" But this seems not to work and I can see the following deployment warning: attachdetach-controller AttachVolume.Attach failed for volume "demo-internal-index" : attachdetachment timeout for volume demo-internal-index But the PV exists: $ kubectl get pv NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE demo-internal-index 1Gi RWO Retain Bound office-demo-internal/index ceph-durable 2m35s and also the PVC exists: $ kubectl get pvc -n office-demo-internal NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE index Bound demo-internal-index 1Gi RWO ceph-durable 53m I guess my PV object is nonsense? Can someone provide me an example how to setup the PV object in Kubernetes. I only found examples where the Ceph Monitoring IPs and the user/password is configured within the PV object. But I would expect that this is covered by the storage class already? Thanks for your help === Ralph

2 years, 11 months

1
1
0 0

2024

2023

2022

2021

2020

2019

ceph-users June 2021