Hi ceph-users,
I'm a bit stuck with librados, particularly the rados_cache_pin function. For
some reason it returns "Invalid argument" (error code 22), and I can't find
what I'm doing wrong. In rados_cache_pin I'm using the same ioctx and
object name that I use with rados_write, which works just fine. Any help
is more than welcome.
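For reference, a minimal sketch of my call sequence (pool and object
names are placeholders):

#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

/* Pin one object; assumes 'cluster' is already connected. */
int pin_object(rados_t cluster)
{
    rados_ioctx_t io;
    int ret = rados_ioctx_create(cluster, "mypool", &io);
    if (ret < 0) {
        fprintf(stderr, "ioctx_create: %s\n", strerror(-ret));
        return ret;
    }

    ret = rados_write(io, "myobject", "data", 4, 0);   /* works fine */
    if (ret == 0)
        ret = rados_cache_pin(io, "myobject");          /* fails, -22 */
    if (ret < 0)
        fprintf(stderr, "op failed: %s\n", strerror(-ret));

    rados_ioctx_destroy(io);
    return ret;
}

One guess I have: is rados_cache_pin only valid when the ioctx points at
a pool that is actually configured as a cache tier, so that a plain pool
rejects the op with -EINVAL?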
Daniel Mezentsev, founder
(+1) 604 313 8592.
Soleks Data Group.
Shaping the clouds.
We have a 5-node cluster, all monitors, installed with cephadm. Recently the hosts needed to be rebooted for upgrades, but since we rebooted them, the hosts fail their cephadm check. As you can see, ceph1 is in quorum and is the host the commands are run from. Below is the output of ceph -s and ceph orch host ls. Running ceph orch pause and resume only removed the "Offline" status of cephmon-temp, which really is offline. How do we fix ceph orch's confusion?
The third offline host, cephmon-temp, is a temp node we had that ceph orch host rm couldn't get rid of.
ceph1:~# ceph -s
health: HEALTH_WARN
3 hosts fail cephadm check
services:
mon: 5 daemons, quorum ceph5,ceph4,ceph3,ceph2,ceph1 (age 2d)
mgr: ceph3.dmpmih(active, since 3w), standbys: ceph5.pwseyi
osd: 30 osds: 30 up (since 2d), 30 in (since 3w)
data:
pools: 2 pools, 129 pgs
objects: 2.29M objects, 8.7 TiB
usage: 22 TiB used, 87 TiB / 109 TiB avail
pgs: 129 active+clean
io:
client: 149 KiB/s wr, 0 op/s rd, 14 op/s wr
ceph1:~# ceph orch host ls
HOST ADDR LABELS STATUS
ceph1 ceph1 mon Offline
ceph2 ceph2 mon Offline
ceph3 ceph3 mon
ceph4 ceph4 mon
ceph5 ceph5 mon
cephmon-temp cephmon-temp Offline
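Is something like the following the right way to re-trigger the check by hand? (Guessing from the docs here: check-host appears to be the manual form of the periodic check, and failing the active mgr restarts the orchestrator module.)

ceph cephadm check-host ceph1
ceph cephadm check-host ceph2
ceph mgr fail ceph3.dmpmih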
Hi Marc,
As far as I know, the osdmap corruption occurs with this osd config:
bluestore_compression_mode=aggressive
bluestore_compression_algorithm=lz4
(My understanding is that if you don't have the above settings, but
use pool-specific compression settings instead, then the osdmaps are
not compressed, so they would not be corruptible).
I prefer to stay on the safe side, though: since we know that lz4 <
1.8.2 can corrupt data when the input comes from fragmented memory, and
we don't know all of the cases where that might happen, I suggest using
another algorithm or turning off compression until you have upgraded to
v14.2.10 or newer.
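For example, something like this (a sketch, assuming you manage options
through the mon config store; the equivalent ceph.conf entries would do
the same):

ceph config set osd bluestore_compression_algorithm snappy
# or, more conservatively, switch compression off entirely for now:
ceph config set osd bluestore_compression_mode none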
And btw, there was another bug earlier in nautilus where those
pool-specific compression settings weren't applied correctly anyway,
so I'm not sure they even work yet in 14.2.9.
-- dan
On Sat, Sep 5, 2020 at 6:12 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
>
> I am still running 14.2.9 with lz4-1.7.5-3. Will I run into this bug
> enabling compression on a pool with:
>
> ceph osd pool set POOL_NAME compression_algorithm COMPRESSION_ALGORITHM
> ceph osd pool set POOL_NAME compression_mode COMPRESSION_MODE
>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
Dear Ceph folks,
As the capacity of a single HDD (OSD) grows bigger and bigger, e.g. from 6TB up to 18TB or even more, should the number of PGs per OSD increase as well, e.g. from 200 to 800? As far as I know, the capacity of each PG should be kept small for performance reasons due to the existence of PG locks, so should I set the number of PGs per OSD to 1000 or even 2000? What is the actual reason for not raising the number of PGs per OSD? Are there any practical limitations on the number of PGs?
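For reference, my understanding of the usual sizing rule (please correct me if this is outdated) is roughly 100 PGs per OSD after replication:

total_pgs ≈ (num_osds × 100) / replica_size, rounded to a power of two

e.g. 100 OSDs with 3x replication gives 100 × 100 / 3 ≈ 3333 → 4096 PGs across all pools. I also believe the mons enforce mon_max_pg_per_osd (250 by default in recent releases) and will refuse pool creation or pg_num increases beyond it, which already seems to rule out 1000-2000.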
thanks a lot,
Samuel
huxiaoyu(a)horebdata.cn
I am still running 14.2.9 with lz4-1.7.5-3. Will I run into this bug
enabling compression on a pool with:
ceph osd pool set POOL_NAME compression_algorithm COMPRESSION_ALGORITHM
ceph osd pool set POOL_NAME compression_mode COMPRESSION_MODE
Hi all,
I'm trying to migrate my 5-node cluster to Nautilus. The nodes run
Debian Stretch as the OS (I intend to update this once I'm onto
Nautilus), and all 5 machines are combined OSD/MON/MDS nodes.
Nodes were originally deployed with an earlier version of Ceph (can't
recall now) using filestore on btrfs. Later, after an abortive attempt
to move to Bluestore (it ran… veeeeerrryyy ssslllooowwwlllyyy), I moved
back to filestore on xfs, which whilst no speed demon, worked well enough.
Journal is on the OSD due to space constraints elsewhere in the nodes.
So step 1:
> If you are unsure whether or not your Luminous cluster has completed a full scrub of all PGs, you can check your cluster’s state by running:
>
> # ceph osd dump | grep ^flags
>
> In order to be able to proceed to Nautilus, your OSD map must include the recovery_deletes and purged_snapdirs flags.
Tick to that, I see both flags.
Step 2.
> Make sure your cluster is stable and healthy (no down or recovering OSDs). (Optional, but recommended.)
Tick, all healthy.
Step 3.
> Set the noout flag for the duration of the upgrade. (Optional, but recommended.):
Done
Step 4.
> Upgrade monitors by installing the new packages and restarting the monitor daemons.
Okay, so I update /etc/apt/sources.list.d/ceph.list to point to the
`nautilus` repository, `apt-get update`, `apt-get dist-upgrade -y`.
This goes without a hitch, I now have Ceph 12 binaries on my nodes.
> systemctl restart ceph-mon.target
Ran that, on all of them (and yes, they are all in the quorum)…
> # ceph mon dump | grep min_mon_release
>
> should report:
>
> min_mon_release 14 (nautilus)
I get this:
> root@helium:~# ceph mon dump | grep min_mon_release
> dumped monmap epoch 4
No `min_mon_release` mentioned anywhere. I tried re-starting the MON
daemons on all 5 nodes, even doing it in parallel using Ansible… no dice.
Nothing says what to do at this point. It's not reporting an earlier
release, it's just not reporting full stop.
Figuring that, well, I *did* update all the monitors, and re-start them,
I press on.
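(In hindsight, a sanity check would have been worth doing at this point,
to confirm what the daemons are actually running:

ceph versions
ceph tell mon.helium version

If those still report a 12.x luminous version — consistent with the
"Ceph 12 binaries" I mentioned above — then the mons were never really
upgraded to 14.x, which would explain the missing min_mon_release line.
One thought: are Nautilus packages even built for Stretch?)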
Step 5.
> Upgrade ceph-mgr daemons by installing the new packages and restarting all manager daemons.
Done, no issues.
> Verify the ceph-mgr daemons are running by checking ceph -s:
I get:
> root@helium:~# ceph -s
> cluster:
> id: 45b532b7-aa3d-4754-9906-d4a70b57630c
> health: HEALTH_WARN
> noout flag(s) set
>
> services:
> mon: 5 daemons, quorum hydrogen,helium,carbon,nitrogen,boron
> mgr: hydrogen(active), standbys: carbon, helium
> mds: cephfs-1/1/1 up {0=helium=up:active}, 2 up:standby
> osd: 5 osds: 5 up, 5 in
> flags noout
>
> data:
> pools: 4 pools, 324 pgs
> objects: 302.75k objects, 1.13TiB
> usage: 3.52TiB used, 5.53TiB / 9.05TiB avail
> pgs: 323 active+clean
> 1 active+clean+scrubbing+deep
>
> io:
> client: 45.0KiB/s rd, 1.10MiB/s wr, 1op/s rd, 38op/s wr
So far so good.
Step 6.
> Upgrade all OSDs by installing the new packages and restarting the ceph-osd daemons on all OSD hosts:
> systemctl restart ceph-osd.target
I actually missed this step at first and went on to step 7, but doubled
back and did it.
Step 7.
> If there are any OSDs in the cluster deployed with ceph-disk (e.g., almost any OSDs that were created before the Mimic release), you need to tell ceph-volume to adopt responsibility for starting the daemons.
Okay, seems simple enough. Now I did this by mistake earlier, so I'll
force it to ensure everything is up to scratch.
> root@hydrogen:~# ceph-volume simple scan --force
> stderr: lsblk: /var/lib/ceph/osd/ceph-5: not a block device
> stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
> Running command: /sbin/cryptsetup status /dev/sdc1
> --> OSD 5 got scanned and metadata persisted to file: /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> --> To take over management of this scanned OSD, and disable ceph-disk and udev, run:
> --> ceph-volume simple activate 5 2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe
> root@hydrogen:~# ceph-volume simple activate --all
> --> activating OSD specified in /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> --> Required devices (data, and journal) not present for filestore
> --> filestore devices found: [u'data']
> --> RuntimeError: Unable to activate filestore OSD due to missing devices
Uhh ohh? So, my journal is on the OSD itself:
> root@hydrogen:~# ls -l /var/lib/ceph/osd/ceph-5/journal
> -rw-r--r-- 1 ceph ceph 21474836480 Aug 29 14:16 /var/lib/ceph/osd/ceph-5/journal
> root@hydrogen:~# cat /etc/ceph/osd/5-2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe.json
> {
> "active": "ok",
> "ceph_fsid": "45b532b7-aa3d-4754-9906-d4a70b57630c",
> "cluster_name": "ceph",
> "data": {
> "path": "/dev/sdc1",
> "uuid": "2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe"
> },
> "fsid": "2a1e6a2b-7742-4cd9-9a39-e4a35ebe74fe",
> "keyring": "…censored…",
> "magic": "ceph osd volume v026",
> "ready": "ready",
> "require_osd_release": "",
> "systemd": "",
> "type": "filestore",
> "whoami": 5
> }
How do I tell it that the journal is a file inside the OSD's XFS data
partition (`/dev/sdc1`), rather than on a separate device?
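(Guessing at the fix from the shape of the "data" entry: would
hand-adding a journal section pointing at the file be enough for
simple activate? Something like

    "journal": {
        "path": "/var/lib/ceph/osd/ceph-5/journal"
    },

in the JSON — the path is my real journal file, the key layout is a
guess — but I'd rather hear from someone who knows before editing it.)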
--
Stuart Longland (aka Redhatter, VK4MSL)
I haven't lost my mind...
...it's backed up on a tape somewhere.
All;
We've been running RadosGW on our nautilus cluster for a while, and we're going to be adding iSCSI capabilities to our cluster, via 2 additional servers.
I intend to also run RadosGW on these servers, which raises the question of how to "load balance" them. I don't believe we need true load balancing (i.e. a dedicated proxy), and I'd rather not add the complexity and a single point of failure.
The question then is: does RadosGW play nice with round-robin DNS? The real question here is whether RadosGW maintains internal client state locally between connections. I would expect it to be safe, given that it speaks HTTP, but I'd prefer to verify.
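For concreteness, all I have in mind is multiple A records (hypothetical zone snippet; the name and addresses are made up):

rgw.example.com.    60    IN    A    192.0.2.11
rgw.example.com.    60    IN    A    192.0.2.12
rgw.example.com.    60    IN    A    192.0.2.13

My understanding is that S3/Swift requests are each individually authenticated (the signature travels with every request), so there should be no server-side session to break; but again, I'd prefer to verify.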
Thank you,
Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
DHilsbos(a)PerformAir.com
www.PerformAir.com
All;
We've used iSCSI to support virtualization for a while, and have used multi-pathing almost the entire time. Now, I'm looking to move from our single box iSCSI hosts to iSCSI on Ceph.
We have 2 independent, non-routed subnets assigned to iSCSI (let's call them 192.168.250.0/24 and 192.168.251.0/24). These subnets are hosted in VLANs 250 and 251, respectively, on our switches. Currently, each target and each initiator has a dedicated network port on each subnet (i.e. 2 NICs per target & 2 NICs per initiator).
I have 2 servers prepared to set up as Ceph iSCSI targets (let's call them ceph-iscsi1 & ceph-iscsi2), and I'm wondering about their network configurations. My initial plan is to configure one on the 250 network, and the other on the 251 network.
Would it be possible to have both servers on both networks? In other words, can I give ceph-iscsi1 both 192.168.250.200 and 192.168.251.200, and ceph-iscsi2 192.168.250.201 and 192.168.251.201?
If that works, I would expect the initiators to see 4 paths to each portal, correct?
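For what it's worth, on the initiator side we'd pair this with a multipath configuration along the lines of the one in the Ceph iSCSI initiator docs (reproduced from memory, so treat it as a sketch and check it against the current documentation):

devices {
        device {
                vendor                 "LIO-ORG"
                hardware_handler       "1 alua"
                path_grouping_policy   "failover"
                path_selector          "queue-length 0"
                failback               60
                path_checker           tur
                prio                   alua
                prio_args              exclusive_pref_bit
                fast_io_fail_tmo       25
                no_path_retry          queue
        }
}

Since ceph-iscsi exposes each RBD image through ALUA with a single active gateway at a time, the extra paths would be failover standbys rather than active-active, which suits us fine.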
Thank you,
Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
DHilsbos(a)PerformAir.com
www.PerformAir.com
Hi,
I am using 15.2.4 as .5 has not been released.
Regards
--
Robert Sander
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 93818 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin