June 2020 - ceph-users - lists.ceph.io

changing acces vlan for all the OSDs - potential downtime ?

by Adrian Nicolae

Hi all, I have a Ceph cluster with a standard setup : - the public network : MONs and OSDs conected in the same agg switch with ports in the same access vlan - private network : OSDs connected in another switch with a second eth connected in another access vlan I need to change the public vlan on the first switch and the private vlan and the second switch. Although it should be a trivial operation (just change the vlan range ports in a single command), it means that all the OSDs and MONs will not be able to communicate with each other for a few seconds . (first on the public network, then on the private network). Do you know if this very short period of downtime will mess up the cluster somehow ? Is there a best practice on how to do this safely ? Thank you , Adrian.

3 years, 10 months

3
2
0 0

Cephadm and Ceph versions

by biohazd＠yahoo.com

Hi I had a cluster on v13 (mimic) and have converted it to Octopus (15.2.3) and using Cephadm. In the dashboard is showing as all v15 What do I need to do with the Ceph rpms that are installed as they are all Ceph version 13. Do I remove them and install Ceph rpms with version 15 ? Regards Andy

3 years, 10 months

2
1
0 0

Change mon bind address / Change IPs with the orchestrator

by Simon Sutter

Hello, I think I missunderstood the internal / public network concepts in the docs https://docs.ceph.com/docs/master/rados/configuration/network-config-ref/. Now there are two questions: - Is it somehow possible to bind the MON daemon to 0.0.0.0? I tried it with manually add the ip in /var/lib/ceph/{UUID}/mon.node01/config [mon.node01] public bind addr = 0.0.0.0 But that does not work, in netstat I can see, the mon still binds to it's internal ip. Is this an expected behaviour? If I set this value to the public ip, the other nodes cannot communicate with it, so this leads to the next question: - What's the Right way to correct the problem with the orchestrator? So the correct way to configure the ip's, would be to set every mon, mds and so on, to the public ip and just let the osd's stay on their internal ip. (described here https://docs.ceph.com/docs/master/rados/configuration/network-config-ref/) Do I have to remove every daemon and redeploy them with "ceph orch daemon rm" / "ceph orch apply"? Or do I have to go to every node and manually apply the settings in the daemon config file? Thanks in advance, Simon

3 years, 10 months

2
2
0 0

log_channel(cluster) log [ERR] : Error -2 reading object

by Frank Schilder

Hi all, I found these messages today: 2020-06-04 17:07:57.471 7fa0aa16e700 -1 log_channel(cluster) log [ERR] : Error -2 reading object 14:e4c5ebb6:::1000203c59b.00000002:head 2020-06-04 17:08:04.236 7fa0aa16e700 -1 log_channel(cluster) log [ERR] : Error -2 reading object 14:e4c9a1a1:::1000203ad7f.00000000:head in one of our OSD logs. The disk is healthy according to smartctl. Should I worry about that? Best regards, ================= Frank Schilder AIT Risø Campus Bygning 109, rum S14

3 years, 10 months

2
2
0 0

bad balacing (octopus)

by Ml Ml

Hello, any idea why it´s so bad balanced? e.g.: osd.52 (82%) vs osd.34 (29%) I did run "/usr/bin/ceph osd reweight-by-utilization " by cron for some time, since i was low on space for some time and that helped a bit. What should i do next? Here is some info: root@ceph01:~# ceph -s cluster: id: 5436dd5d-83d4-4dc8-a93b-60ab5db145df health: HEALTH_OK services: mon: 3 daemons, quorum ceph01,ceph02,ceph03 (age 4h) mgr: ceph01(active, since 4h), standbys: ceph03, ceph02 osd: 55 osds: 55 up (since 4h), 54 in (since 4h); 10 remapped pgs data: pools: 3 pools, 2049 pgs objects: 8.17M objects, 29 TiB usage: 89 TiB used, 45 TiB / 133 TiB avail pgs: 27740/24497208 objects misplaced (0.113%) 2034 active+clean 9 active+remapped+backfilling 5 active+clean+scrubbing+deep 1 active+remapped+backfill_wait io: recovery: 172 MiB/s, 45 objects/s root@ceph01:~# ceph balancer status { "active": true, "last_optimize_duration": "0:00:00.017034", "last_optimize_started": "Thu Jun 4 15:26:28 2020", "mode": "upmap", "optimize_result": "Unable to find further optimization, or pool(s) pg_num is decreasing, or distribution is already perfect", "plans": [] } root@ceph01:~# ceph osd df tree ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 111.70750 - 133 TiB 89 TiB 88 TiB 4.9 GiB 240 GiB 44 TiB 66.79 1.00 - root default -2 25.44400 - 25 TiB 19 TiB 19 TiB 1.0 GiB 49 GiB 6.1 TiB 75.99 1.14 - host ceph01 0 hdd 2.39999 1.00000 2.7 TiB 1.9 TiB 1.8 TiB 107 MiB 5.2 GiB 835 GiB 69.45 1.04 130 up osd.0 1 hdd 2.29999 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 79 MiB 5.4 GiB 920 GiB 66.36 0.99 124 up osd.1 4 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 127 MiB 5.6 GiB 620 GiB 77.31 1.16 144 up osd.4 8 hdd 2.39999 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 111 MiB 5.3 GiB 755 GiB 72.38 1.08 135 up osd.8 11 hdd 1.71660 1.00000 1.7 TiB 1.3 TiB 1.3 TiB 56 MiB 2.6 GiB 418 GiB 76.22 1.14 92 up osd.11 12 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 122 MiB 5.3 GiB 625 GiB 77.14 1.16 144 up osd.12 14 hdd 2.67029 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 114 MiB 5.6 GiB 642 GiB 76.53 1.15 144 up osd.14 18 hdd 2.79999 1.00000 2.7 TiB 2.3 TiB 2.3 TiB 143 MiB 6.0 GiB 411 GiB 84.95 1.27 158 up osd.18 22 hdd 1.71660 1.00000 1.7 TiB 1.3 TiB 1.3 TiB 59 MiB 2.4 GiB 388 GiB 77.94 1.17 94 up osd.22 30 hdd 1.79999 0.95001 1.7 TiB 1.4 TiB 1.4 TiB 71 MiB 2.9 GiB 324 GiB 81.58 1.22 96 up osd.30 32 hdd 0.50000 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up osd.32 33 hdd 1.79999 0.95001 1.6 TiB 1.3 TiB 1.3 TiB 54 MiB 2.4 GiB 323 GiB 80.55 1.21 91 up osd.33 -3 24.52742 - 26 TiB 19 TiB 19 TiB 1.1 GiB 54 GiB 6.8 TiB 74.10 1.11 - host ceph02 2 hdd 1.00000 1.00000 1.7 TiB 827 GiB 825 GiB 29 MiB 2.2 GiB 931 GiB 47.05 0.70 56 up osd.2 3 hdd 2.89999 0.95001 2.7 TiB 2.2 TiB 2.2 TiB 123 MiB 6.1 GiB 454 GiB 83.38 1.25 154 up osd.3 7 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 152 MiB 5.8 GiB 540 GiB 80.25 1.20 150 up osd.7 9 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 172 MiB 6.0 GiB 553 GiB 79.79 1.19 149 up osd.9 13 hdd 1.79999 1.00000 2.4 TiB 1.4 TiB 1.4 TiB 70 MiB 4.6 GiB 979 GiB 59.91 0.90 100 up osd.13 16 hdd 2.89999 0.95001 2.7 TiB 2.2 TiB 2.2 TiB 140 MiB 6.0 GiB 479 GiB 82.46 1.23 153 up osd.16 19 hdd 1.39999 1.00000 1.7 TiB 1.1 TiB 1.1 TiB 41 MiB 2.8 GiB 616 GiB 64.96 0.97 78 up osd.19 23 hdd 2.00000 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 96 MiB 5.3 GiB 1.1 TiB 59.58 0.89 111 up osd.23 24 hdd 1.71660 1.00000 1.7 TiB 1.4 TiB 1.4 TiB 52 MiB 3.3 GiB 334 GiB 81.02 1.21 97 up osd.24 28 hdd 2.79999 1.00000 2.7 TiB 2.2 TiB 2.2 TiB 143 MiB 6.2 GiB 440 GiB 83.90 1.26 155 up osd.28 31 hdd 2.67029 1.00000 2.7 TiB 2.2 TiB 2.2 TiB 116 MiB 6.0 GiB 523 GiB 80.87 1.21 149 up osd.31 -4 20.40346 - 25 TiB 16 TiB 16 TiB 932 MiB 46 GiB 8.2 TiB 66.48 1.00 - host ceph03 5 hdd 1.71660 1.00000 1.7 TiB 1.4 TiB 1.4 TiB 61 MiB 2.8 GiB 363 GiB 79.37 1.19 95 up osd.5 6 hdd 1.71660 1.00000 1.7 TiB 1.4 TiB 1.4 TiB 67 MiB 2.9 GiB 329 GiB 81.26 1.22 97 up osd.6 10 hdd 2.50000 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 123 MiB 5.9 GiB 714 GiB 73.89 1.11 139 up osd.10 15 hdd 2.29999 1.00000 2.7 TiB 1.8 TiB 1.8 TiB 102 MiB 5.7 GiB 860 GiB 68.55 1.03 127 up osd.15 17 hdd 1.39999 1.00000 1.6 TiB 1.1 TiB 1.1 TiB 48 MiB 2.5 GiB 524 GiB 68.42 1.02 78 up osd.17 20 hdd 1.59999 1.00000 1.7 TiB 1.3 TiB 1.3 TiB 52 MiB 3.0 GiB 444 GiB 74.73 1.12 90 up osd.20 21 hdd 2.00000 1.00000 2.7 TiB 1.6 TiB 1.6 TiB 103 MiB 5.4 GiB 1.1 TiB 59.96 0.90 112 up osd.21 25 hdd 1.00000 1.00000 1.7 TiB 847 GiB 845 GiB 45 MiB 2.2 GiB 911 GiB 48.20 0.72 57 up osd.25 26 hdd 2.50000 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 126 MiB 5.7 GiB 719 GiB 73.70 1.10 138 up osd.26 27 hdd 1.00000 1.00000 2.7 TiB 820 GiB 816 GiB 61 MiB 3.6 GiB 1.9 TiB 29.97 0.45 56 up osd.27 29 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 145 MiB 6.0 GiB 540 GiB 80.24 1.20 149 up osd.29 -11 12.96458 - 20 TiB 11 TiB 11 TiB 581 MiB 28 GiB 9.0 TiB 54.04 0.81 - host ceph04 34 hdd 1.00000 1.00000 2.7 TiB 819 GiB 815 GiB 70 MiB 3.6 GiB 1.9 TiB 29.96 0.45 55 up osd.34 35 hdd 1.00000 1.00000 2.7 TiB 860 GiB 857 GiB 59 MiB 3.3 GiB 1.8 TiB 31.46 0.47 57 up osd.35 44 hdd 1.00000 1.00000 2.7 TiB 856 GiB 853 GiB 56 MiB 3.6 GiB 1.8 TiB 31.31 0.47 57 up osd.44 45 hdd 1.00000 1.00000 2.7 TiB 816 GiB 812 GiB 65 MiB 3.6 GiB 1.9 TiB 29.84 0.45 55 up osd.45 48 hdd 1.71660 1.00000 1.7 TiB 1.4 TiB 1.4 TiB 50 MiB 2.7 GiB 330 GiB 81.21 1.22 95 up osd.48 51 hdd 3.62399 1.00000 3.6 TiB 3.0 TiB 2.9 TiB 143 MiB 5.3 GiB 685 GiB 81.55 1.22 201 up osd.51 52 hdd 3.62399 1.00000 3.6 TiB 3.0 TiB 3.0 TiB 138 MiB 5.4 GiB 644 GiB 82.64 1.24 204 up osd.52 -13 11.24799 - 16 TiB 9.2 TiB 9.2 TiB 490 MiB 22 GiB 6.6 TiB 58.02 0.87 - host ceph05 46 hdd 1.00000 1.00000 2.6 TiB 856 GiB 853 GiB 56 MiB 3.1 GiB 1.7 TiB 32.48 0.49 57 up osd.46 47 hdd 1.00000 1.00000 2.6 TiB 857 GiB 854 GiB 86 MiB 3.4 GiB 1.7 TiB 32.51 0.49 57 up osd.47 49 hdd 1.00000 1.00000 1.7 TiB 830 GiB 829 GiB 36 MiB 1.9 GiB 927 GiB 47.24 0.71 56 up osd.49 50 hdd 1.00000 1.00000 1.7 TiB 826 GiB 824 GiB 30 MiB 1.8 GiB 932 GiB 47.00 0.70 55 up osd.50 53 hdd 3.62399 1.00000 3.6 TiB 2.9 TiB 2.9 TiB 159 MiB 5.9 GiB 696 GiB 81.25 1.22 200 up osd.53 54 hdd 3.62399 1.00000 3.6 TiB 2.9 TiB 2.9 TiB 123 MiB 5.7 GiB 691 GiB 81.38 1.22 200 up osd.54 -9 17.12006 - 21 TiB 14 TiB 14 TiB 843 MiB 42 GiB 7.3 TiB 65.39 0.98 - host ceph06 36 hdd 2.50000 1.00000 2.7 TiB 2.0 TiB 2.0 TiB 125 MiB 5.8 GiB 677 GiB 75.24 1.13 140 up osd.36 37 hdd 2.39999 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 120 MiB 5.8 GiB 769 GiB 71.89 1.08 134 up osd.37 38 hdd 2.67029 1.00000 2.7 TiB 2.1 TiB 2.1 TiB 125 MiB 5.9 GiB 551 GiB 79.85 1.20 149 up osd.38 39 hdd 2.39999 1.00000 2.7 TiB 1.9 TiB 1.9 TiB 121 MiB 5.6 GiB 773 GiB 71.74 1.07 134 up osd.39 40 hdd 2.57489 1.00000 2.6 TiB 2.0 TiB 2.0 TiB 91 MiB 5.6 GiB 543 GiB 79.40 1.19 143 up osd.40 41 hdd 1.00000 1.00000 2.6 TiB 850 GiB 846 GiB 85 MiB 3.9 GiB 1.7 TiB 32.23 0.48 57 up osd.41 42 hdd 2.57489 1.00000 2.6 TiB 2.0 TiB 2.0 TiB 90 MiB 5.6 GiB 544 GiB 79.38 1.19 142 up osd.42 43 hdd 1.00000 1.00000 2.6 TiB 844 GiB 840 GiB 87 MiB 3.9 GiB 1.8 TiB 31.99 0.48 57 up osd.43 TOTAL 133 TiB 89 TiB 88 TiB 4.9 GiB 241 GiB 45 TiB 66.79 MIN/MAX VAR: 0.45/1.27 STDDEV: 18.42

3 years, 10 months

2
1
0 0

nfs-ganesha mount hangs every day since upgrade to nautilus

by Marc Roos

After having to revert back to ceph-fuse upgrading to nautilus, I have also that the nfs-ganesha mount stalls/breaks every day. Probably caused by: 1 clients failing to respond to capability release 2 clients failing to respond to cache pressure 1 MDSs report slow requests How to fix this?

3 years, 10 months

1
0
0 0

speed up individual backfills

by Thomas Bennett

Hi, I have 15628 misplaced objects that are currently backfilling as follows: 1. pgid:14.3ce1 from:osd.1321 to:osd.3313 2. pgid:14.4dd9 from:osd.1693 to:osd.2980 3. pgid:14.680b from:osd.362 to:osd.3313 These are remnant backfills from a pg-upmap/rebalance campaign after we've added 2 new racks worth of osds to our cluster. Our mon db is bloated so I'm wanting to trim the mon db before continuing the next pg-upmap/rebalance campaign. So, my question is: Is there any way I can speed up the backfill process on these individual osds? Or hints to trace out why these are so slow? Regards

3 years, 10 months

1
1
0 0

ceph orch upgrade stuck at the beginning.

by Gencer W. Genç

Hi, I've 15.2.1 installed on all machines. On primary machine I executed ceph upgrade command: $ ceph orch upgrade start --ceph-version 15.2.2 When I check ceph -s I see this: progress: Upgrade to docker.io/ceph/ceph:v15.2.2 (30m) [=...........................] (remaining: 8h) It says 8 hours. It is already ran for 3 hours. No upgrade processed. It get stuck at this point. Is there any way to know why this has stuck? Thanks, Gencer.

3 years, 10 months

3
19
0 0

Cephadm Setup Query

by Shivanshi .

3 years, 10 months

2
1
0 0

Octopus 15.2.2 unable to make drives available (reject reason locked)...

by Marco Pizzolo

Hello, Hitting an issue with a new 15.2.2 deployment using cephadm. I am having a problem creating encrypted, 2 osds per device OSDs (they are NVMe). After removing and bootstrapping the cluster again, i am unable to create OSDs as they're locked. sgdisk, wipefs, zap all fail to leave the drives as available. Any help would be appreciated. Any comments on performance experiences with ceph in containers (cephadm deployed) vs bare metal (ceph-deploy) would be greatly appreciated as well. Thanks, Marco ceph orch device ls HOST PATH TYPE SIZE DEVICE AVAIL REJECT REASONS prdhcistonode01 /dev/nvme0n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_2006266528D1 False *locked* prdhcistonode01 /dev/nvme1n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_2006266534D9 False *locked* prdhcistonode01 /dev/nvme2n1 ssd 953G INTEL SSDPEKKF010T8_BTHH850215GA1P0E False *locked* prdhcistonode01 /dev/nvme3n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_200626651473 False *locked* prdhcistonode01 /dev/nvme4n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_2006266508FB False * locked* prdhcistonode01 /dev/nvme5n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_20062664E6E8 False *locked* prdhcistonode01 /dev/nvme6n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_200626653CC0 False * locked* prdhcistonode01 /dev/nvme7n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_1939243B797E False * locked* prdhcistonode01 /dev/nvme8n1 ssd 11.6T Micron_9300_MTFDHAL12T8TDR_200626652441 False *locked* lsblk NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT nvme2n1 259:0 0 953.9G 0 disk ├─nvme2n1p1 259:1 0 512M 0 part /boot/efi └─nvme2n1p2 259:2 0 953.4G 0 part / nvme3n1 259:3 0 11.7T 0 disk └─ceph--5bd47cae--97b3--4cad--b010--215fd982497b-osd--data--e6045acd--a56d--41d2--a016--b8647b9a717a 253:1 0 11.7T 0 lvm nvme4n1 259:4 0 11.7T 0 disk └─ceph--bf7dbfb4--afe3--4391--9847--08e461bf6247-osd--data--12faafac--b695--4c30--b6d7--7046d8275d9f 253:0 0 11.7T 0 lvm nvme0n1 259:5 0 11.7T 0 disk └─ceph--1a5d8e23--ff7d--44c3--b6d2--de143fed2b7d-osd--block--b6593547--e99a--4add--8edd--5d0fb53254cd 253:2 0 11.7T 0 lvm nvme5n1 259:6 0 11.7T 0 disk └─ceph--7d85ff24--79c8--4792--a2c8--bb4908f77ff0-osd--data--fc4e9dbd--920f--41b8--8467--74e9dcbd57ca 253:3 0 11.7T 0 lvm nvme6n1 259:7 0 11.7T 0 disk └─ceph--d8c8652a--1cd8--4e10--a333--4ea10f3b5004-osd--data--9a70a549--3cba--4f0d--a13a--8465781a10e9 253:5 0 11.7T 0 lvm nvme8n1 259:8 0 11.7T 0 disk └─ceph--e1914f1c--2385--4c0c--9951--d4b9200b7164-osd--data--8876559c--6393--4fbc--821b--7ac74cfb5a54 253:7 0 11.7T 0 lvm nvme7n1 259:9 0 11.7T 0 disk └─ceph--3765b53a--75eb--489e--97e1--d6b03bc25532-osd--data--777638e0--a325--401d--a01d--459676871003 253:4 0 11.7T 0 lvm nvme1n1 259:10 0 11.7T 0 disk └─ceph--2124f206--2b50--41a1--8a3c--d47c1a909a3b-osd--block--88e4f1eb--73f4--4c83--b978--fe7cabc0c3e6 253:6 0 11.7T 0 lvm

3 years, 10 months

2
2
0 0

2024

2023

2022

2021

2020

2019

ceph-users June 2020