Hello,
I have a Nautilus (14.2.8) cluster and I'd like to give a user access to a pool via librados.
Here is what I have:
> # ceph osd pool ls detail | grep user1
> pool 5 'user1' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 108 flags hashpspool max_bytes 1099511627776 stripe_width 0 application user1
> # ceph auth get client.user1
> exported keyring for client.user1
> [client.user1]
> key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
> caps: [mon] allow r
> caps: [osd] allow rw pool=user1 namespace=user1
On the client
> $ cat ~/ceph.conf
> [global]
> mon host = [v2:10.90.36.16:3300,v1:10.90.36.16:6789],[v2:10.90.36.17:3300,v1:10.90.36.17:6789],[v2:10.90.36.18:3300,v1:10.90.36.18:6789]
> keyring = ~/user1.keyring
> $ cat ~/user1.keyring
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
> $ rados -c ~/ceph.conf -p pool ls
> 2020-04-02 12:44:59.900 7fd78aea3700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
> 2020-04-02 12:44:59.900 7fd789ea1700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
> 2020-04-02 12:44:59.900 7fd78a6a2700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2,1]
> failed to fetch mon config (--no-mon-config to skip)
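One thing I'm not sure about: whether a bare key is accepted in the keyring file at all, or whether it needs the usual INI-style section plus an explicit client name, along these lines (untested sketch on my side):

    $ cat ~/user1.keyring
    [client.user1]
            key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
    $ rados -c ~/ceph.conf --id user1 -p user1 ls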
Is there something I missed?
Thanks for your help.
Best regards,
--
Yoann Moulin
EPFL IC-IT
I already have the time logged, I do not need it a second time.
Mar 31 13:39:59 c01 ceph-mgr: 2020-03-31 13:39:59.518 7f554edc8700 0 log_channel(cluster) log [DBG] : pgmap v672065: 384 pgs: 384 active+clean;
Hi,
I'm trying to understand the "LARGE_OMAP_OBJECTS 1 large omap objects"
warning for our cephfs metadata pool.
It seems that pg 5.26 has a large omap object with > 200k keys
[WRN] : Large omap object found. Object: 5:654134d2:::mds0_openfiles.0:head
PG: 5.4b2c82a6 (5.26) Key count: 286083 Size (bytes): 14043228
I guess this object is related to the open files tracked by the MDS (the
mds0_openfiles.0 object). But what exactly does it tell me? Is the key
count the number of currently open files?
If yes, this does not match the sum of open files over all clients
obtained with lsof (which is less than 1000).
So how can I get rid of this? (Reboot the clients?)
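For what it's worth, I assume the key count can be re-checked directly with something like this (guessing the metadata pool name here):

    # rados -p cephfs_metadata listomapkeys mds0_openfiles.0 | wc -l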
Thanks for your help
Dietmar
--
_________________________________________
D i e t m a r R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter - Institute of Bioinformatics
Innrain 80, 6020 Innsbruck
Email: dietmar.rieder(a)i-med.ac.at
Web: http://www.icbi.at
Hi,
I am currently building a 10-node Ceph cluster. Each OSD node has 2x 25 Gbit/s NICs, and I have 2 TOR switches (MLAG is not supported).
enp179s0f0 -> sw1
enp179s0f1 -> sw2
vlan 323 is used for ‘public network’
vlan 324 is used for ‘cluster network’
My desired configuration is to create two bond interfaces in active-backup mode:
bond0
- enp179s0f0.323 (active)
- enp179s0f1.323 (backup)
bond1
- enp179s0f0.324 (backup)
- enp179s0f1.324 (active)
This way, the public network will use switch1, and the cluster network will use switch2, under normal operation.
I am, however, having an issue implementing this configuration in Ubuntu 18.04 with netplan (see configuration at the end of this post).
When I reboot a node with the below netplan configuration, the bond interface is created, but the vlan interfaces are not added to the bond.
I see the following errors in the log:
systemd-networkd[1641]: enp179s0f0.323: Enslaving by 'bond0'
systemd-networkd[1641]: bond0: Enslaving link 'enp179s0f0.323'
systemd-networkd[1641]: enp179s0f1.323: Enslaving by 'bond0'
systemd-networkd[1641]: bond0: Enslaving link 'enp179s0f1.323'
systemd-networkd[1643]: enp179s0f1.323: Could not join netdev: Operation not permitted
systemd-networkd[1643]: enp179s0f1.323: Failed
systemd-networkd[1643]: enp179s0f0.323: Could not join netdev: Operation not permitted
systemd-networkd[1643]: enp179s0f0.323: Failed
If I manually run 'systemctl restart systemd-networkd' after boot has completed, the bond is successfully created with the VLAN interfaces.
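As a stopgap I could probably automate that restart with a oneshot unit along these lines (untested sketch, hypothetical unit name), but I'd rather understand the proper fix:

    # /etc/systemd/system/restart-networkd-once.service
    [Unit]
    Description=Restart systemd-networkd once after boot (bond/VLAN ordering workaround)
    After=systemd-networkd.service

    [Service]
    Type=oneshot
    ExecStart=/bin/systemctl restart systemd-networkd

    [Install]
    WantedBy=multi-user.target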
Does anybody have a similar configuration working specifically with netplan/networkd? Could you please share your configuration?
Netplan config that doesn’t work at boot time:
network:
  version: 2
  renderer: networkd
  ethernets:
    enp179s0f0: {}
    enp179s0f1: {}
  bonds:
    bond0:
      dhcp4: false
      dhcp6: false
      interfaces:
        - enp179s0f0.323
        - enp179s0f1.323
      parameters:
        mode: active-backup
        primary: enp179s0f0.323
        mii-monitor-interval: 1
      addresses: [insert address here]
    bond1:
      dhcp4: false
      dhcp6: false
      interfaces:
        - enp179s0f0.324
        - enp179s0f1.324
      parameters:
        mode: active-backup
        primary: enp179s0f1.324
        mii-monitor-interval: 1
      addresses: [insert address here]
  vlans:
    enp179s0f0.323:
      id: 323
      link: enp179s0f0
    enp179s0f1.323:
      id: 323
      link: enp179s0f1
    enp179s0f0.324:
      id: 324
      link: enp179s0f0
    enp179s0f1.324:
      id: 324
      link: enp179s0f1
Hi everyone,
I'm working on replacing an OSD node with a newer one. The new host has a new hostname and new disks (faster ones, but the same size as the old disks). My plan is:
- Reweight the OSD to zero to spread all existing data to the remaining nodes and keep data available
- Set the noout, norebalance, norecover and nobackfill flags, destroy the OSD, and join the new OSD with the same ID as the old one.
With the above approach the cluster remaps PGs across all nodes, and each object is moved twice before it reaches the new OSD (once for the reweight, once when the new node joins with the same ID); the command sequence is sketched below.
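For reference, the replace-in-place sequence looks roughly like this (a sketch; <id> and /dev/sdX are placeholders):

    ceph osd crush reweight osd.<id> 0          # drain the old OSD first
    ceph osd set noout && ceph osd set norebalance
    ceph osd destroy <id> --yes-i-really-mean-it
    # on the new host, reuse the old OSD ID:
    ceph-volume lvm create --osd-id <id> --data /dev/sdX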
I also tried the other way, only setting the flags and destroying the OSD, but the result is still the same (degraded objects from the destroyed OSD and misplaced objects after the new OSD joins).
Is there any way to replace an OSD node directly without remapping PGs across the whole cluster?
Many thanks!
Nghia.
Hi all,
When using the sendfile() function to write data to cephfs, the data doesn't end up being written.
From the client that writes the file it looks correct at first, but from all other ceph clients the size is 0 bytes. After re-mounting the filesystem, the data is lost.
I didn't see any errors; the data just doesn't get written, as if it were only cached in the cephfs client.
Writing just one extra byte at the end of the file (without sendfile) seems to trigger the actual write of all the data.
Could someone confirm whether they are also seeing this issue? I'm on ceph 13.2.8, using the kernel module for mounting on CentOS 7.
I've used this sendfile-example for the example below:
https://github.com/pijewski/sendfile-example/blob/master/sendfile.c
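The core of that example boils down to roughly this (my own minimal sketch, error handling omitted):

    /* copy src (argv[1]) to dst (argv[2]) via sendfile(2) */
    #include <fcntl.h>
    #include <sys/sendfile.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int in  = open(argv[1], O_RDONLY);
        int out = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0644);
        struct stat st;
        off_t off = 0;

        fstat(in, &st);
        sendfile(out, in, &off, st.st_size); /* returns bytes copied */

        close(in);
        close(out); /* note: no fsync/flush before close */
        return 0;
    }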
Using a small 27 byte source file.
# ls -lh examples/
-rw-r--r-- 1 root c3-staff 27 Mar 24 18:04 src
# ./sendfile examples/src examples/dst 27
# ls -lh examples/
------x--- 1 root c3-staff 27 Mar 24 18:12 dst
-rw-r--r-- 1 root c3-staff 27 Mar 24 18:04 src
But the directory size still shows 27 bytes:
# ls -lhd examples
drwxr-sr-x 1 root c3-staff 27 Mar 24 18:15 examples
and on all other cephfs clients, the file is empty:
# ls -lh examples/
------x--- 1 root c3-staff 0 Mar 24 18:12 dst
-rw-r--r-- 1 root c3-staff 27 Mar 24 18:04 src
Is this a bug in cephfs, or should I not expect sendfile to work (as it is not POSIX compliant)? There are no errors reported from what I can see, and it is 100% reproducible.
Best regards, Mikael
Doh, I hope so!
On Wed, Apr 1, 2020 at 5:35 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
> April fools day!!!!!! :)
>
>
> -----Original Message-----
> Sent: 01 April 2020 17:28
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] [Octopus] Beware the on-disk conversion
>
> Hi,
>
> As the upgrade documentation says:
> > Note that the first time each OSD starts, it will do a format
> > conversion to improve the accounting for omap data. This may take a
> > few minutes to as much as a few hours (for an HDD with lots of omap
> > data). You can disable this automatic conversion with:
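> (I believe the command elided here is 'ceph config set osd
> bluestore_fsck_quick_fix_on_mount false', quoting the release notes
> from memory.)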
>
> What the documentation does not say is that this process takes a lot of
> memory.
>
> I am upgrading a rusty cluster from Nautilus; you can see the RAM
> consumption in the attachment.
>
> First, we have a 3 TB OSD conversion: it took ~15 min and 19 GB of memory.
>
> Then, we have a larger 6 TB OSD conversion: it took more than 2 hours and
> 35 GB of memory.
>
> Finally, we have the largest 10 TB OSD: only 1h15, but 52 GB of memory.
Dear all,
I have two observations regarding bluestore compression config:
1) ceph.conf settings seem to be ignored.
2) The SSD default values seem not to save space using compression.
To 1) We are running a mimic 13.2.8 cluster with OSDs deployed under mimic 13.2.2. Back then the interpretation of compression parameters was messed up; this has been fixed along the way from 13.2.2 to 13.2.8. To get compression to work properly under 13.2.2, I needed to include these settings in ceph.conf:
[osd]
bluestore compression mode = aggressive
bluestore compression min blob size hdd = 262144
and then also enable compression on all pools that should use it. These settings are still present in ceph.conf, but they seem to be ignored when the config database is populated on mon startup, or when querying config parameters:
# ceph config get osd.16 bluestore_compression_min_blob_size_hdd
131072
However:
# ceph tell osd.16 config get bluestore_compression_min_blob_size_hdd
262144
and:
# ceph config show osd.16
NAME VALUE SOURCE OVERRIDES IGNORES
bluestore_compression_min_blob_size_hdd 262144 file
This is really confusing. Is this intended? Which values will be used when deploying new OSDs?
In general, it would really be helpful if one could query daemon/parameter groups as in " ceph config get osd bluestore_compression_min_blob_size_hdd" to get a list right away.
To 2) In a long-long-ago discussion about how compression works, I was told that a blob of bluestore_compression_min_blob_size will be compressed and then distributed over a number of allocations of bluestore_min_alloc_size. The defaults for HDD and SSD are:
bluestore_compression_min_blob_size_hdd  131072
bluestore_min_alloc_size_hdd              65536
bluestore_compression_min_blob_size_ssd    8192
bluestore_min_alloc_size_ssd              16384
If this explanation of the compression method is correct, these defaults allow up to 50% savings on HDD but, erm, 0% on SSD: the uncompressed blob will use the same amount of space as the compressed one, since both require the same allocation size.
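Spelled out with the numbers above (my arithmetic, assuming that explanation still holds):

    HDD: 131072-byte blob, compresses to <= 65536 bytes
         -> 1 x 65536-byte allocation instead of 2 -> up to 50% saved
    SSD: 8192-byte blob, compressed or not
         -> still 1 x 16384-byte allocation -> 0% saved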
Did something change here? Are compressed blobs now co-located in allocations?
Thanks for your help,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14