May 2021 - ceph-users - lists.ceph.io

How can I get tail information a parted rados object

by morphin

Hello.

I'm trying to export objects from rados with "rados get". Some objects are bigger than 4M and they have tails. Is there any easy way to get the tail information of an object?

For example, this is an object:
- c106b26b.3_Img/2017/12/im034113.jpg

These are the object parts:
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_1
- c106b26b.3__shadow_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.1_2
- c106b26b.3__multipart_Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9.2

As you can see, the object has 2 multipart and 2 shadow objects. The jpg only works when I get all the parts and concatenate them in order:

order: "cat 9.1 9.1_1 9.1_2 9.2 > im034113.jpg"

I'm trying to write code that reads objects from a list, finds all the parts, and puts them together in order, but I couldn't find a good way to get the part information.

I followed the link https://www.programmersought.com/article/31497869978/ and got the object manifest with getxattr and decoded it with "ceph-dencoder type RGWBucketEnt decode dump_json", but in the manifest I cannot find anything I can code against. It's not useful.

Is there any other place where I can get the part information of an object? Or better: is there any tool to export an object with its tails?

btw: these objects were created by RGW using s3. RGW can not access these files. Because of that I'm trying to export them from rados and send them to a different RGW.
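The ordering step described in the post can be sketched in Python. This is only a sketch: the naming pattern (a trailing ".<part>" on multipart objects and ".<part>_<stripe>" on shadow objects) is an assumption inferred from the four example names above, not something taken from RGW documentation, and the helper names are hypothetical. It only sorts names that have already been listed; it does not discover parts.

```python
import re

# Assumed naming pattern, inferred only from the example names in the post:
#   <marker>__multipart_<key>.<upload-id>.<part>
#   <marker>__shadow_<key>.<upload-id>.<part>_<stripe>
_MULTIPART = re.compile(r"__multipart_.*\.(\d+)$")
_SHADOW = re.compile(r"__shadow_.*\.(\d+)_(\d+)$")


def part_sort_key(name):
    """Return (part, stripe); the multipart head of a part gets stripe 0."""
    match = _SHADOW.search(name)
    if match:
        return (int(match.group(1)), int(match.group(2)))
    match = _MULTIPART.search(name)
    if match:
        return (int(match.group(1)), 0)
    raise ValueError(f"not a multipart or shadow object name: {name!r}")


def order_parts(names):
    """Sort RADOS object names into the order they should be concatenated."""
    return sorted(names, key=part_sort_key)


if __name__ == "__main__":
    upload = "Img/2017/12/im034113.jpg.2~fjrC5r_KCWMBat_4bFVtmBv9pxcVL-9"
    listed = [
        f"c106b26b.3__multipart_{upload}.2",
        f"c106b26b.3__shadow_{upload}.1_2",
        f"c106b26b.3__multipart_{upload}.1",
        f"c106b26b.3__shadow_{upload}.1_1",
    ]
    for name in order_parts(listed):
        print(name)
```

For the example object this yields the same sequence as the "cat" command in the post: part 1, its two shadow objects, then part 2.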

Troubleshoot MDS failure

by Alessandro Piazza

Dear all,

I'm having a hard time troubleshooting a file-system failure on my 3-node cluster (deployed with cephadm + docker). After moving some files between folders, the cluster became laggy and the Metadata Servers started failing and got stuck in the rejoin state. Of course I have already tried restarting the cluster multiple times. The mds units are now in a failed state because of too many restarts; the file system is degraded and cannot be mounted because no mds is up. I think the data pool is ok because I can get files using rados.

I can trigger the standby mds to become the "major" with ceph orch daemon rm mds <mds-in-error-id>, or deploy a new one, but the new "major" mds goes into the error state again. I don't find the mds logs really helpful, but you can find one in the attachments for someone more expert than me.

I am hesitant to follow the guide https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ because of the warnings and because cephfs-journal-tool is poorly documented.

The following might be useful:

seppia:~# ceph fs status
starfs - 0 clients
======
RANK  STATE          MDS                        ACTIVITY  DNS  INOS  DIRS  CAPS
0     rejoin(laggy)  starfs.polposition.njarir            539  25    17    0
POOL                TYPE      USED   AVAIL
cephfs.starfs.meta  metadata  9900M  1027G
cephfs.starfs.data  data      12.1T  1027G
MDS version: ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) pacific (stable)

seppia:~ # ceph health detail
HEALTH_WARN 2 failed cephadm daemon(s); 1 filesystem is degraded; insufficient standby MDS daemons available; 7 pgs not deep-scrubbed in time
[WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s)
    daemon mds.starfs.polposition.njarir on polposition.starfleet.sns.it is in error state
    daemon mds.starfs.seppia.wdwrho on seppia.starfleet.sns.it is in error state
[WRN] FS_DEGRADED: 1 filesystem is degraded
    fs starfs is degraded
[WRN] MDS_INSUFFICIENT_STANDBY: insufficient standby MDS daemons available
    have 0; want 1 more
[WRN] PG_NOT_DEEP_SCRUBBED: 7 pgs not deep-scrubbed in time
    pg 3.a8 not deep-scrubbed since 2021-04-20T20:07:48.346677+0000
    pg 3.a2 not deep-scrubbed since 2021-04-21T08:10:55.220263+0000
    pg 3.7 not deep-scrubbed since 2021-04-21T07:24:20.073569+0000
    pg 2.0 not deep-scrubbed since 2021-04-21T05:01:18.439456+0000
    pg 9.1a not deep-scrubbed since 2021-04-21T05:18:20.171151+0000
    pg 3.1cb not deep-scrubbed since 2021-04-20T21:54:38.251349+0000
    pg 3.1ef not deep-scrubbed since 2021-04-21T07:07:18.842132+0000

Thanks for any suggestions,
Alessandro Piazza

Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

by Ilya Dryomov

On Mon, May 3, 2021 at 12:00 PM Magnus Harlander <magnus(a)harlan.de> wrote:
>
> On 03.05.21 at 11:22, Ilya Dryomov wrote:
>
> max_osd 12
>
> I never had more than 10 osds on the two osd nodes of this cluster.
>
> I was running a 3 osd-node cluster earlier with more than 10
> osds, but the current cluster has been setup from scratch and
> I definitely don't remember having ever more than 10 osds!
> Very strange!
>
> I had to replace 2 disks because of DOA-Problems, but for that
> I removed 2 osds and created new ones.
>
> I used ceph-deploy to create new osds.
>
> To delete osd.8 I used:
>
> # take it out
> ceph osd out 8
>
> # wait for rebalancing to finish
>
> systemctl stop ceph-osd@8
>
> # wait for a healthy cluster
>
> ceph osd purge 8 --yes-i-really-mean-it
>
> # edit ceph.conf and remove osd.8
>
> ceph-deploy --overwrite-conf admin s0 s1
>
> # Add the new disk and:
> ceph-deploy osd create --data /dev/sdc s0
> ...
>
> it gets created with the next free osd num (8) because purge releases 8 for reuse

It would be nice to track it down, but for the immediate issue of kernel 5.11 not working, "ceph osd setmaxosd 10" should fix it.

Thanks,

Ilya

Cannot create issue in bugtracker

by Tobias Urdin

Hello,

Am I the only one who has been getting an Internal error when trying to create issues in the bug tracker for the past day or two?

https://tracker.ceph.com/issues/new

Best regards

Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

by Ilya Dryomov

On Mon, May 3, 2021 at 9:20 AM Magnus Harlander <magnus(a)harlan.de> wrote:
>
> On 03.05.21 at 00:44, Ilya Dryomov wrote:
>
> On Sun, May 2, 2021 at 11:15 PM Magnus Harlander <magnus(a)harlan.de> wrote:
>
> Hi,
>
> I know there is a thread about problems with mounting cephfs with 5.11 kernels.
>
> ...
>
> Hi Magnus,
>
> What is the output of "ceph config dump"?
>
> Instead of providing those lines, can you run "ceph osd getmap 64281 -o
> osdmap.64281" and attach osdmap.64281 file?
>
> Thanks,
>
> Ilya
>
> Hi Ilya,
>
> [root@s1 ~]# ceph config dump
> WHO     MASK  LEVEL     OPTION                                 VALUE     RO
> global        basic     device_failure_prediction_mode         local
> global        advanced  ms_bind_ipv4                           false
> mon           advanced  auth_allow_insecure_global_id_reclaim  false
> mon           advanced  mon_lease                              8.000000
> mgr           advanced  mgr/devicehealth/enable_monitoring     true
>
> getmap output is attached,

I see the problem, but I don't understand the root cause yet. It is related to the two missing OSDs:

> May 02 22:54:05 islay kernel: libceph: no match of type 1 in addrvec
> May 02 22:54:05 islay kernel: libceph: corrupt full osdmap (-2) epoch 64281 off 3154 (00000000a90fe1d7 of 000000000083f4bd-00000000c03bdc9b)

> max_osd 12

> osd.0 up in ... [v2:192.168.200.141:6804/3027,v1:192.168.200.141:6805/3027] ... exists,up 631bc170-45fd-4948-9a5e-4c278569c0bc
> osd.1 up in ... [v2:192.168.200.140:6811/3066,v1:192.168.200.140:6813/3066] ... exists,up 660a762c-001d-4160-a9ee-d0acd078e776
> osd.2 up in ... [v2:192.168.200.141:6815/3008,v1:192.168.200.141:6816/3008] ... exists,up e4d94d3a-ec58-46a1-b61c-c47dd39012ed
> osd.3 up in ... [v2:192.168.200.140:6800/3067,v1:192.168.200.140:6801/3067] ... exists,up 26d25060-fd99-4d15-a1b2-ebb77646671e
> osd.4 up in ... [v2:192.168.200.140:6804/3049,v1:192.168.200.140:6806/3049] ... exists,up 238f197d-ecbc-4588-8a99-6a63c9bb1a17
> osd.5 up in ... [v2:192.168.200.140:6816/3073,v1:192.168.200.140:6817/3073] ... exists,up a9dcb26f-0f1c-4067-a26b-a29939285e0b
> osd.6 up in ... [v2:192.168.200.141:6808/3020,v1:192.168.200.141:6809/3020] ... exists,up f399b47d-063f-4b2f-bd93-289377dc9945
> osd.7 up in ... [v2:192.168.200.141:6800/3023,v1:192.168.200.141:6801/3023] ... exists,up 3557ceca-7bd8-401e-abd3-59bee168e8f6
> osd.8 up in ... [v2:192.168.200.141:6812/3017,v1:192.168.200.141:6813/3017] ... exists,up 7f9cad3f-163d-4bb7-85b2-fffd46982fff
> osd.9 up in ... [v2:192.168.200.140:6805/3053,v1:192.168.200.140:6807/3053] ... exists,up c543b12a-f9bf-4b83-af16-f6b8a3926e69

The kernel client is failing to parse addrvec entries for non-existent osd10 and osd11. It is probably being too stringent, but before fixing it I'd like to understand what happened to those OSDs. It looks like they were removed but not completely.

What led to their removal? What commands were used?

Thanks,

Ilya
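The mismatch described here (max_osd 12, but entries only for osd.0 through osd.9) can be spotted mechanically. The following Python sketch assumes the plain-text osdmap layout shown in this thread, with one "max_osd N" line and one "osd.N ..." line per OSD, and without the "> " quote prefixes; the function name is hypothetical and nothing here calls Ceph itself.

```python
import re


def unused_osd_slots(osdmap_text):
    """Return (max_osd, ids below max_osd that have no 'osd.N' line)."""
    found = re.search(r"^[ \t]*max_osd (\d+)", osdmap_text, re.MULTILINE)
    if found is None:
        raise ValueError("no max_osd line found")
    max_osd = int(found.group(1))
    # Collect every id that has its own "osd.N ..." line.
    present = {
        int(osd_id)
        for osd_id in re.findall(r"^[ \t]*osd\.(\d+) ", osdmap_text, re.MULTILINE)
    }
    missing = [osd_id for osd_id in range(max_osd) if osd_id not in present]
    return max_osd, missing


if __name__ == "__main__":
    # Abbreviated stand-in for the osdmap text posted in the thread.
    lines = ["max_osd 12"] + [f"osd.{i} up in weight 1 ..." for i in range(10)]
    max_osd, missing = unused_osd_slots("\n".join(lines))
    print(f"max_osd={max_osd}, slots without an OSD: {missing}")
```

On the map from this thread the missing slots would be 10 and 11, the two entries the kernel client trips over.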

global multipart lc policy in radosgw

by Boris Behrens

Hi, I have a lot of multipart uploads that look like they never finished. Some of them date back to 2019. Is there a way to clean them up when they didn't finish in 28 days? I know I can implement a LC policy per bucket, but how do I implement it cluster wide? Cheers Boris -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.
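For reference, the per-bucket policy mentioned in the question can be applied in a loop. The sketch below is an assumption-laden illustration, not a cluster-wide RGW setting: it expects "s3" to be an S3 client with the boto3 method names list_buckets and put_bucket_lifecycle_configuration, pointed at the RGW endpoint, and the function names and rule ID are made up for the example. Note that list_buckets only returns buckets owned by the credentials in use, and that putting a lifecycle configuration replaces whatever configuration the bucket already had.

```python
def abort_incomplete_uploads_config(days):
    """Build an S3 lifecycle configuration that aborts stale multipart uploads."""
    return {
        "Rules": [
            {
                "ID": f"abort-incomplete-multipart-{days}d",  # illustrative ID
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_to_all_buckets(s3, days=28):
    """Apply the rule to every bucket visible to this client; return their names.

    Caution: this overwrites any existing lifecycle configuration on each bucket.
    """
    config = abort_incomplete_uploads_config(days)
    names = [bucket["Name"] for bucket in s3.list_buckets()["Buckets"]]
    for name in names:
        s3.put_bucket_lifecycle_configuration(
            Bucket=name, LifecycleConfiguration=config
        )
    return names
```

To cover a whole cluster this way, the loop would have to be repeated with credentials for each bucket owner.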

cephfs mount problems with 5.11 kernel - not a ipv6 problem

by Magnus Harlander

Hi,

I know there is a thread about problems with mounting cephfs with 5.11 kernels. I tried everything that's mentioned there, but I still can not mount a cephfs from an octopus node.

I verified:

- I can not mount with 5.11 client kernels (fedora 33 and ubuntu 21.04)
- I can mount with 5.10 client kernels
- It is not due to ipv4/ipv6. I'm not using ipv6
- I was using a cluster network on a private network segment. Because this was mentioned as a possible cause for the problems (next to ipv6) I removed the cluster network and now I'm using the same network for osd syncs and client connections. It did not help.
- mount returns with a timeout and error after about 1 minute
- I tried the ms_mode=legacy (and others) mount options. Nothing helped
- I tried to use IP:PORT:/fs to mount to exclude DNS as the cause. Didn't help.
- I set up a similar test cluster on a few VMs and did not have a problem with mounting. I even used cluster networks, which also worked fine.

I'm running out of ideas. Any help would be appreciated.
\Magnus

My Setup:

SERVER OS:
==========
[root@s1 ~]# hostnamectl
   Static hostname: s1.harlan.de
         Icon name: computer-desktop
           Chassis: desktop
        Machine ID: 3a0a6308630842ffad6b9bb8be4c7547
           Boot ID: ffb2948d3934419dafceb0990316d9fd
  Operating System: CentOS Linux 8
       CPE OS Name: cpe:/o:centos:centos:8
            Kernel: Linux 4.18.0-240.22.1.el8_3.x86_64
      Architecture: x86-64

CEPH VERSION:
=============
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)

CLIENT OS:
==========
[root@islay ~]# hostnamectl
   Static hostname: islay
         Icon name: computer-laptop
           Chassis: laptop
        Machine ID: 6de7b27dfd864e9ea52b8b0cff47cdfc
           Boot ID: 6d8d8bb36f274458b2b761b0a046c8ad
  Operating System: Fedora 33 (Workstation Edition)
       CPE OS Name: cpe:/o:fedoraproject:fedora:33
            Kernel: Linux 5.11.16-200.fc33.x86_64
      Architecture: x86-64

CEPH VERSION:
=============
[root@islay harlan]# ceph version
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)
[root@s1 ~]# ceph version
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)

FSTAB ENTRY:
============
cfs0,cfs1:/fs /data/fs ceph rw,_netdev,name=admin,secretfile=/etc/ceph/fs.secret 0 0

IP CONFIG MON/OSD NODE (s1)
=======================
[root@s1 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp4s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
3: enp5s0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether a8:a1:59:18:e7:ea brd ff:ff:ff:ff:ff:ff
4: vmbr: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
    inet 192.168.200.111/24 brd 192.168.200.255 scope global noprefixroute vmbr
       valid_lft forever preferred_lft forever
    inet 192.168.200.141/24 brd 192.168.200.255 scope global secondary noprefixroute vmbr
       valid_lft forever preferred_lft forever
    inet 192.168.200.101/24 brd 192.168.200.255 scope global secondary vmbr
       valid_lft forever preferred_lft forever
    inet6 fe80::be55:705d:7c9e:eaa4/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue master vmbr state UP group default qlen 1000
    link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
6: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether 52:54:00:32:ea:2f brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
7: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
    link/ether 52:54:00:32:ea:2f brd ff:ff:ff:ff:ff:ff
8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master vmbr state UNKNOWN group default qlen 1000
    link/ether fe:54:00:67:4d:15 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe67:4d15/64 scope link
       valid_lft forever preferred_lft forever

CEPH STATUS:
============
[root@s1 ~]# ceph -s
  cluster:
    id:     86bbd6c5-ae96-4c78-8a5e-50623f0ae524
    health: HEALTH_OK

  services:
    mon: 4 daemons, quorum s0,mbox,s1,r1 (age 6h)
    mgr: s1(active, since 6h), standbys: s0
    mds: fs:1 {0=s1=up:active} 1 up:standby
    osd: 10 osds: 10 up (since 6h), 10 in (since 6h)

  data:
    pools:   6 pools, 289 pgs
    objects: 1.75M objects, 1.6 TiB
    usage:   3.3 TiB used, 13 TiB / 16 TiB avail
    pgs:     289 active+clean

  io:
    client:   0 B/s rd, 245 KiB/s wr, 0 op/s rd, 4 op/s wr

CEPH OSD TREE:
==============
[root@s1 ~]# ceph osd tree
ID   CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
 -1         16.99994  root default
 -9          8.39996      host s0
  1  hdd     4.00000          osd.1   up      1.00000   1.00000
  5  hdd     1.79999          osd.5   up      1.00000   1.00000
  9  hdd     1.79999          osd.9   up      1.00000   1.00000
  3  ssd     0.50000          osd.3   up      1.00000   1.00000
  4  ssd     0.29999          osd.4   up      1.00000   1.00000
-12          8.59998      host s1
  6  hdd     1.79999          osd.6   up      1.00000   1.00000
  7  hdd     1.79999          osd.7   up      1.00000   1.00000
  8  hdd     4.00000          osd.8   up      1.00000   1.00000
  0  ssd     0.50000          osd.0   up      1.00000   1.00000
  2  ssd     0.50000          osd.2   up      1.00000   1.00000

CEPH MON STAT:
==============
[root@s1 ~]# ceph mon stat
e19: 4 mons at {mbox=[v2:192.168.200.5:3300/0,v1:192.168.200.5:6789/0],r1=[v2:192.168.200.113:3300/0,v1:192.168.200.113:6789/0],s0=[v2:192.168.200.110:3300/0,v1:192.168.200.110:6789/0],s1=[v2:192.168.200.111:3300/0,v1:192.168.200.111:6789/0]}, election epoch 8618, leader 0 s0, quorum 0,1,2,3 s0,mbox,s1,r1

CEPH FS DUMP:
=============
[root@s1 ~]# ceph fs dump
dumped fsmap epoch 15534
e15534
enable_multiple, ever_enabled_multiple: 0,0
compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 2

Filesystem 'fs' (2)
fs_name fs
epoch 15534
flags 12
created 2021-02-02T18:47:25.306744+0100
modified 2021-05-02T16:33:36.738341+0200
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
min_compat_client 0 (unknown)
last_failure 0
last_failure_osd_epoch 64252
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {0=54782953}
failed
damaged
stopped
data_pools [10]
metadata_pool 11
inline_data disabled
balancer
standby_count_wanted 1
[mds.s1{0:54782953} state up:active seq 816 addr [v2:192.168.200.111:6800/1895356761,v1:192.168.200.111:6801/1895356761]]

Standby daemons:

[mds.s0{-1:54958514} state up:standby seq 1 addr [v2:192.168.200.110:6800/297471268,v1:192.168.200.110:6801/297471268]]

CEPH CONF:
==========
[root@s1 ~]# cat /etc/ceph/ceph.conf
[global]
fsid = 86bbd6c5-ae96-4c78-8a5e-50623f0ae524
mon_initial_members = s0, s1, mbox, r1
mon_host = 192.168.200.110,192.168.200.111,192.168.200.5,192.168.200.113
ms_bind_ipv4 = true
ms_bind_ipv6 = false
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public network = 192.168.200.0/24

[osd]
public network = 192.168.200.0/24
osd_memory_target = 2147483648
osd crush update on start = false

[osd.1]
public addr = 192.168.200.140
osd_memory_target = 2147483648

[osd.3]
public addr = 192.168.200.140
osd_memory_target = 2147483648

[osd.4]
public addr = 192.168.200.140
osd_memory_target = 2147483648

[osd.5]
public addr = 192.168.200.140
osd_memory_target = 2147483648

[osd.9]
public addr = 192.168.200.140
osd_memory_target = 2147483648

[osd.0]
public addr = 192.168.200.141
osd_memory_target = 2147483648

[osd.2]
public addr = 192.168.200.141
osd_memory_target = 2147483648

[osd.6]
public addr = 192.168.200.141
osd_memory_target = 2147483648

[osd.7]
public addr = 192.168.200.141
osd_memory_target = 2147483648

[osd.8]
public addr = 192.168.200.141
osd_memory_target = 2147483648

CEPH FS STAT
============
[root@s1 ~]# ceph fs status
fs - 0 clients
==
RANK  STATE   MDS  ACTIVITY     DNS  INOS
0     active  s1   Reqs: 0 /s   0    0
POOL    TYPE      USED   AVAIL
cfs_md  metadata  2365M  528G
cfs     data      2960G  4967G
STANDBY MDS
s0
VERSION                                                                           DAEMONS
None                                                                              s1
ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus (stable)  s0

CLIENT JOURNALCTL WHEN MOUNTING
===============================
May 02 22:54:04 islay kernel: FS-Cache: Loaded
May 02 22:54:05 islay kernel: Key type ceph registered
May 02 22:54:05 islay kernel: libceph: loaded (mon/osd proto 15/24)
May 02 22:54:05 islay kernel: FS-Cache: Netfs 'ceph' registered for caching
May 02 22:54:05 islay kernel: ceph: loaded (mds proto 32)
May 02 22:54:05 islay kernel: libceph: mon1 (1)192.168.200.111:6789 session established
May 02 22:54:05 islay kernel: libceph: mon1 (1)192.168.200.111:6789 socket closed (con state OPEN)
May 02 22:54:05 islay kernel: libceph: mon1 (1)192.168.200.111:6789 session lost, hunting for new mon
May 02 22:54:05 islay kernel: libceph: mon0 (1)192.168.200.5:6789 session established
May 02 22:54:05 islay kernel: libceph: no match of type 1 in addrvec
May 02 22:54:05 islay kernel: libceph: corrupt full osdmap (-2) epoch 64281 off 3154 (00000000a90fe1d7 of 000000000083f4bd-00000000c03bdc9b)
May 02 22:54:05 islay kernel: osdmap: 00000000: 08 07 4f 24 00 00 09 01 9e 12 00 00 86 bb d6 c5 ..O$............
May 02 22:54:05 islay kernel: osdmap: 00000010: ae 96 4c 78 8a 5e 50 62 3f 0a e5 24 19 fb 00 00 ..Lx.^Pb?..$....
May 02 22:54:05 islay kernel: osdmap: 00000020: 54 f0 53 5d 3a fd ae 0e 1b 07 8f 60 b3 8e d2 2f T.S]:......`.../
May 02 22:54:05 islay kernel: osdmap: 00000030: 06 00 00 00 02 00 00 00 00 00 00 00 1d 05 44 01 ..............D.
May 02 22:54:05 islay kernel: osdmap: 00000040: 00 00 01 02 02 02 20 00 00 00 20 00 00 00 00 00 ...... ... .....
May 02 22:54:05 islay kernel: osdmap: 00000050: 00 00 00 00 00 00 5e fa 00 00 2e 04 00 00 00 00 ......^.........
May 02 22:54:05 islay kernel: osdmap: 00000060: 00 00 5e fa 00 00 00 00 00 00 00 00 00 00 00 00 ..^.............
..... many more lines, I can provide them if they are useful.

CEPH OSDMAP:
============
epoch 64281
fsid 86bbd6c5-ae96-4c78-8a5e-50623f0ae524
created 2019-08-14T13:28:20.246349+0200
modified 2021-05-02T22:10:03.802328+0200
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 140
full_ratio 0.92
backfillfull_ratio 0.9
nearfull_ratio 0.88
require_min_compat_client jewel
min_compat_client jewel
require_osd_release octopus

pool 2 'vms' replicated size 2 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 64094 lfor 0/62074/62072 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 8 'ssdpool' replicated size 2 min_size 1 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 61436 lfor 0/61436/61434 flags hashpspool stripe_width 0
pool 9 'hddpool' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 61413 lfor 0/61413/61411 flags hashpspool stripe_width 0
pool 10 'cfs' replicated size 2 min_size 1 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode on last_change 63328 flags hashpspool,selfmanaged_snaps stripe_width 0 application cephfs
pool 11 'cfs_md' replicated size 2 min_size 1 crush_rule 2 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode on last_change 63332 flags hashpspool stripe_width 0 application cephfs
pool 12 'device_health_metrics' replicated size 2 min_size 1 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode off last_change 64255 flags hashpspool stripe_width 0 application mgr_devicehealth

max_osd 12
osd.0 up in weight 1 up_from 64236 up_thru 64263 down_at 64233 last_clean_interval [64211,64231) [v2:192.168.200.141:6804/3027,v1:192.168.200.141:6805/3027] [v2:192.168.200.111:6806/3027,v1:192.168.200.111:6807/3027] exists,up 631bc170-45fd-4948-9a5e-4c278569c0bc
osd.1 up in weight 1 up_from 64259 up_thru 64260 down_at 64249 last_clean_interval [64223,64248) [v2:192.168.200.140:6811/3066,v1:192.168.200.140:6813/3066] [v2:192.168.200.110:6813/3066,v1:192.168.200.110:6815/3066] exists,up 660a762c-001d-4160-a9ee-d0acd078e776
osd.2 up in weight 1 up_from 64236 up_thru 64266 down_at 64233 last_clean_interval [64211,64231) [v2:192.168.200.141:6815/3008,v1:192.168.200.141:6816/3008] [v2:192.168.200.111:6816/3008,v1:192.168.200.111:6817/3008] exists,up e4d94d3a-ec58-46a1-b61c-c47dd39012ed
osd.3 up in weight 1 up_from 64256 up_thru 64264 down_at 64249 last_clean_interval [64221,64248) [v2:192.168.200.140:6800/3067,v1:192.168.200.140:6801/3067] [v2:192.168.200.110:6802/3067,v1:192.168.200.110:6803/3067] exists,up 26d25060-fd99-4d15-a1b2-ebb77646671e
osd.4 up in weight 1 up_from 64256 up_thru 64264 down_at 64249 last_clean_interval [64221,64248) [v2:192.168.200.140:6804/3049,v1:192.168.200.140:6806/3049] [v2:192.168.200.110:6806/3049,v1:192.168.200.110:6807/3049] exists,up 238f197d-ecbc-4588-8a99-6a63c9bb1a17
osd.5 up in weight 1 up_from 64260 up_thru 64260 down_at 64249 last_clean_interval [64226,64248) [v2:192.168.200.140:6816/3073,v1:192.168.200.140:6817/3073] [v2:192.168.200.110:6818/3073,v1:192.168.200.110:6819/3073] exists,up a9dcb26f-0f1c-4067-a26b-a29939285e0b
osd.6 up in weight 1 up_from 64240 up_thru 64260 down_at 64233 last_clean_interval [64218,64231) [v2:192.168.200.141:6808/3020,v1:192.168.200.141:6809/3020] [v2:192.168.200.111:6810/3020,v1:192.168.200.111:6811/3020] exists,up f399b47d-063f-4b2f-bd93-289377dc9945
osd.7 up in weight 1 up_from 64238 up_thru 64260 down_at 64233 last_clean_interval [64214,64231) [v2:192.168.200.141:6800/3023,v1:192.168.200.141:6801/3023] [v2:192.168.200.111:6802/3023,v1:192.168.200.111:6803/3023] exists,up 3557ceca-7bd8-401e-abd3-59bee168e8f6
osd.8 up in weight 1 up_from 64242 up_thru 64260 down_at 64233 last_clean_interval [64216,64231) [v2:192.168.200.141:6812/3017,v1:192.168.200.141:6813/3017] [v2:192.168.200.111:6814/3017,v1:192.168.200.111:6815/3017] exists,up 7f9cad3f-163d-4bb7-85b2-fffd46982fff
osd.9 up in weight 1 up_from 64257 up_thru 64257 down_at 64249 last_clean_interval [64229,64248) [v2:192.168.200.140:6805/3053,v1:192.168.200.140:6807/3053] [v2:192.168.200.110:6808/3053,v1:192.168.200.110:6809/3053] exists,up c543b12a-f9bf-4b83-af16-f6b8a3926e69

blacklist 192.168.200.110:0/3803039218 expires 2021-05-03T15:33:52.837358+0200
blacklist 192.168.200.111:6800/3725740504 expires 2021-05-03T15:37:38.953040+0200
blacklist 192.168.200.110:6822/3464419 expires 2021-05-03T15:56:28.124585+0200
blacklist 192.168.200.110:6801/838484672 expires 2021-05-03T15:56:13.108594+0200
blacklist 192.168.200.110:6800/838484672 expires 2021-05-03T15:56:13.108594+0200
blacklist 192.168.200.111:6841/159804987 expires 2021-05-03T14:54:05.413130+0200
blacklist 192.168.200.111:6840/159804987 expires 2021-05-03T14:54:05.413130+0200
blacklist 192.168.200.111:6801/3725740504 expires 2021-05-03T15:37:38.953040+0200
blacklist 192.168.200.110:6807/453197 expires 2021-05-03T15:33:52.837358+0200
blacklist 192.168.200.5:6801/3078236863 expires 2021-05-03T14:38:57.694004+0200
blacklist 192.168.200.110:0/1948864559 expires 2021-05-03T15:33:52.837358+0200
blacklist 192.168.200.111:6800/3987205903 expires 2021-05-03T15:32:12.633802+0200
blacklist 192.168.200.111:6800/2342337613 expires 2021-05-03T14:46:57.936272+0200
blacklist 192.168.200.110:0/3020995128 expires 2021-05-03T15:56:28.124585+0200
blacklist 192.168.200.5:6800/3078236863 expires 2021-05-03T14:38:57.694004+0200
blacklist 192.168.200.110:0/2607867017 expires 2021-05-03T15:33:52.837358+0200
blacklist 192.168.200.111:6801/3987205903 expires 2021-05-03T15:32:12.633802+0200
blacklist 192.168.200.110:0/3159222459 expires 2021-05-03T15:56:28.124585+0200
blacklist 192.168.200.110:6806/453197 expires 2021-05-03T15:33:52.837358+0200
blacklist 192.168.200.110:6823/3464419 expires 2021-05-03T15:56:28.124585+0200
blacklist 192.168.200.111:6801/2342337613 expires 2021-05-03T14:46:57.936272+0200
blacklist 192.168.200.111:6800/2205788037 expires 2021-05-03T14:56:56.448631+0200
blacklist 192.168.200.111:6801/2205788037 expires 2021-05-03T14:56:56.448631+0200

--
Dr. Magnus Harlander
Mail: harlan(a)harlan.de
Web: www.harlan.de
Stiftung: www.harlander-stiftung.de
Ceterum censeo bitcoin esse delendam!

How radosgw works ?

by Fabrice Bacchella

I'm trying to understand what radosgw listens on, and where. There is a lot of contradictory or redundant information about that.

First, the contradictory information about the socket. At https://docs.ceph.com/en/pacific/radosgw/config-ref/ it says rgw_socket_path, but at https://docs.ceph.com/en/pacific/man/8/radosgw/ it says 'rgw socket path'. That problem is quite common in the ceph documentation. Are both values accepted?

Next, about naming and the binding IP. Where is it defined, and how? You have:

rgw_frontends = "beast ssl_endpoint=0.0.0.0:443 port=443 ..."
rgw_host =
rgw_port =
rgw_dns_name =

That's a lot of redundant or contradictory information. What is the purpose of each one? What is the difference between rgw_frontends = ".. port = ..." and rgw_port = ? Or between rgw_host and rgw_dns_name?

The documentation provides no help at all:

rgw_dns_name
Description: The DNS name of the served domain. See also the hostnames setting within regions.

The description says nothing new, it just repeats the field name.

Is one of them used by the manager for communication? I already had this problem with the entry in the certificate used by the frontend: it used an IP coming from nowhere. If fcgi is used, how does the manager find the endpoint?
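On the first point, the Ceph configuration documentation describes spaces, dashes and underscores in option names as interchangeable in configuration files, which would make the two spellings the same option. The Python sketch below only illustrates that equivalence; the function name is hypothetical, it is not Ceph's own parser, and it says nothing about how rgw_frontends, rgw_port, rgw_host and rgw_dns_name relate to each other.

```python
def normalize_option_name(name):
    """Map 'rgw socket path' / 'rgw-socket-path' to 'rgw_socket_path'.

    Illustrative only: collapses runs of whitespace, then treats spaces
    and dashes as underscores.
    """
    collapsed = " ".join(name.split())
    return collapsed.replace(" ", "_").replace("-", "_")


if __name__ == "__main__":
    for spelling in ("rgw_socket_path", "rgw socket path", "rgw-socket-path"):
        print(f"{spelling!r} -> {normalize_option_name(spelling)!r}")
```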

Big OSD add, long backfill, degraded PGs, deep-scrub backlog, OSD restarts

by Dave Hall

Hello,

I recently added 2 OSD nodes to my Nautilus cluster, increasing the OSD count from 32 to 48 - all 12TB HDDs with NVMe for db. I generally keep an ssh session open where I can run 'watch ceph -s'. My observations are mostly based on what I saw from watching this.

Even with 10GB networking, rebalancing 529 pgs took 10 days, during which there were always a few PGs undersized+degraded, frequent flashes of slow ops, occasional OSD restarts, and the scrub and deep-scrub backlog steadily increased. When the backfills completed I had 24 missed deep-scrubs and 10 missed scrubs.

I suspect that this is because of some settings that I had fiddled with, so this post may be an advertisement for what not to do to your cluster. However, I'd like to know if my understanding is accurate.

In short, I think I had my config set up so that there was contention due to too many processes trying to do things to some OSDs all at once:

- osd_scrub_during_recovery: I think I had this set to true for the first 9 days, but set it to false when I started to realize that it might be causing contention
- osd_max_scrubs: I had this set high - global:30 osd:10. At some earlier time when I had a scrub backlog I thought that these were counts for simultaneous scrubs across all OSDs rather than 'per OSD'
  - Now I see why the default is 1.
  - Assumption: on an HDD, multiple competing scrubs cause excessive seeking and thus compound impacts to scrub progress
- osd_max_backfills: I had bumped this up as well - global:30 osd:10, thinking it would speed up the rebalancing of my PGs onto my new OSDs.
  - Now, the same thinking as for osd_max_scrubs: compounding contention, further compounded by the scrub activity that should have been inhibited by osd_scrub_during_recovery:false.

I believe that all of this also resulted in my EC pgs (8 + 2) becoming degraded. My assumption here is that collisions between deep-scrubs and backfills sometimes locked the backfill process out of a piece of an EC PG, causing backfill to rebuild instead of copy.

The good news is that I haven't lost any data and, other than the scrub backlog, things seem to be working smoothly. It seems like with 1 or 2 scrubs (deep or regular) running they are taking about 2 hours per scrub. As the scrubs progress, more scrub deadlines are missed, so it's not a steady march to zero.

Please feel free to comment. I'd be glad to know if I'm on the right track as we expect the cluster to double in size over the next 12 to 18 months.

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
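The figures in this post allow a rough back-of-the-envelope estimate of how long the backlog takes to clear. The Python sketch below uses only the numbers quoted above (24 missed deep-scrubs, 10 missed scrubs, about 2 hours per scrub, 1 or 2 running at once) and assumes, unrealistically, that no further scrubs fall due in the meantime, so the result is a lower bound. The function name is hypothetical.

```python
def hours_to_clear(backlog, hours_per_scrub, concurrent):
    """Lower bound on wall-clock hours to work through a fixed scrub backlog."""
    if concurrent < 1:
        raise ValueError("at least one scrub must run at a time")
    return backlog * hours_per_scrub / concurrent


if __name__ == "__main__":
    backlog = 24 + 10  # missed deep-scrubs + missed scrubs, from the post
    for concurrent in (1, 2):
        hours = hours_to_clear(backlog, hours_per_scrub=2, concurrent=concurrent)
        print(f"{concurrent} concurrent: {hours:.0f} h ({hours / 24:.1f} days)")
```

With one scrub at a time this gives 68 hours, and with two it gives 34 hours. Because new deadlines keep being missed while the backlog drains, the real figure would be higher, which matches the observation that it is not a steady march to zero.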

Best distro to run ceph.

by Peter Childs

I'm trying to set up a new ceph cluster, and I've hit a bit of a blank.

I started off with centos7 and cephadm. It worked fine up to a point: I had to upgrade podman, but it mostly worked with octopus.

Since this is a fresh cluster and hence there is no data at risk, I decided to jump straight to Pacific when it came out and upgrade. Which is where my trouble began, mostly because Pacific needs a version of lvm later than what's in centos7.

I can't upgrade to centos8 as my boot drives are not supported by centos8 due to the way Red Hat disabled lots of disk drivers. I think I'm looking at Ubuntu or debian.

Given that cephadm has a very limited set of dependencies, it would be good to have a support matrix. It would also be good to have a check in cephadm on upgrade that says "no, I won't upgrade" if the version of lvm2 is too low on any host, and lets the admin fix the issue and try again.

I was thinking of upgrading to centos8 for this project anyway, until I realised that centos8 can't support the hardware I've inherited.

But currently I've got a broken cluster unless I can work out some way to upgrade lvm in centos7.

Peter.
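The pre-upgrade check suggested here could look roughly like the following Python sketch. It is not part of cephadm; the function names, host names and version numbers are all hypothetical, and the minimum version used in the example is a placeholder rather than the actual Pacific requirement. It also only handles plain dotted version strings such as "2.02.187", not distro suffixes.

```python
def parse_version(text):
    """Turn '2.02.187' into (2, 2, 187) so versions compare numerically."""
    return tuple(int(piece) for piece in text.split("."))


def hosts_blocking_upgrade(lvm_versions, minimum):
    """Return the sorted hosts whose lvm2 version is below the minimum."""
    needed = parse_version(minimum)
    return sorted(
        host
        for host, version in lvm_versions.items()
        if parse_version(version) < needed
    )


if __name__ == "__main__":
    # Hypothetical inventory; in practice this would be gathered per host.
    inventory = {"node1": "2.02.187", "node2": "2.03.11"}
    blocked = hosts_blocking_upgrade(inventory, minimum="2.03.0")  # placeholder
    if blocked:
        print("refusing to upgrade, lvm2 too old on:", ", ".join(blocked))
    else:
        print("all hosts meet the lvm2 minimum")
```

Refusing before anything is changed leaves the cluster on the old release, so the admin can fix the listed hosts and try again.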
