January 2024 - ceph-users

NFS HA - "virtual_ip": null after upgrade to reef

by Torkil Svensgaard

Hi Last week we created an NFS service like this: " ceph nfs cluster create jumbo "ceph-flash1,ceph-flash2,ceph-flash3" --ingress --virtual_ip 172.21.15.74/22 --ingress-mode keepalive-only " Worked like a charm. Yesterday we upgraded from 17.2.7 to 18.20.0 and the NFS virtual IP seems to have gone missing in the process: " # ceph nfs cluster info jumbo { "jumbo": { "backend": [ { "hostname": "ceph-flash1", "ip": "172.21.15.148", "port": 2049 } ], "virtual_ip": null } } " Service spec: " service_type: nfs service_id: jumbo service_name: nfs.jumbo placement: count: 1 hosts: - ceph-flash1 - ceph-flash2 - ceph-flash3 spec: port: 2049 virtual_ip: 172.21.15.74 " I've tried restarting the nfs.jumbo service which didn't help. Suggestions? Mvh. Torkil -- Torkil Svensgaard Sysadmin MR-Forskningssektionen, afs. 714 DRCMR, Danish Research Centre for Magnetic Resonance Hvidovre Hospital Kettegård Allé 30 DK-2650 Hvidovre Denmark Tel: +45 386 22828 E-mail: torkil(a)drcmr.dk

3 months, 1 week

2
7
0 0

how to avoid pglogs dups bug in Pacific

by ADRIAN NICOLAE

Hi, I'm running Pacific 16.2.4 and I want to start a manual pg split process on the data pool (from 2048 to 4096). I'm reluctant to upgrade to 16.2.14/15 at this point. Can I avoid the dups bug (https://tracker.ceph.com/issues/53729) if I will increase the pgs slowly with 32 or 64pgs at every increment instead of moving directly to 4096 ? I don't have the autoscaler enabled. Thanks.

3 months, 1 week

1
0
0 0

Re: RGW crashes when rgw_enable_ops_log is enabled

by Marc Singer

Hi Matt Thanks for your answer. Should I open a bug report then? How would I be able to read more from it? Have multiple threads access it and read from it simultaneously? Marc On 1/25/24 20:25, Matt Benjamin wrote: > Hi Marc, > > No, the only thing you need to do with the Unix socket is to keep > reading from it. So it probably is getting backlogged. And while you > could arrange things to make that less likely, you likely can't make > it impossible, so there's a bug here. > > Matt > > On Thu, Jan 25, 2024 at 10:52 AM Marc Singer <marc(a)singer.services> wrote: > > Hi > > I am using a unix socket client to connect with it and read the data > from it. > Do I need to do anything like signal the socket that this data has > been > read? Or am I not reading fast enough and data is backing up? > > What I am also noticing that at some point (probably after something > with the ops socket happens), the log level seems to increase for > some > reason? I did not find anything in the logs yet why this would be > the case. > > *Normal:* > > 2024-01-25T15:47:58.444+0000 7fe98a5c0b00 1 ====== starting new > request > req=0x7fe98712c720 ===== > 2024-01-25T15:47:58.548+0000 7fe98b700b00 1 ====== req done > req=0x7fe98712c720 op status=0 http_status=200 > latency=0.104001537s ====== > 2024-01-25T15:47:58.548+0000 7fe98b700b00 1 beast: 0x7fe98712c720: > redacted - redacted [25/Jan/2024:15:47:58.444 +0000] "PUT > /redacted/redacted/chunks/27/27242/27242514_10_4194304 HTTP/1.1" 200 > 4194304 - "redacted" - latency=0.104001537s > > *Close before crashing: > * > > -509> 2024-01-25T14:54:31.588+0000 7f5186648b00 1 ====== starting > new request req=0x7f517ffca720 ===== > -508> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s initializing for trans_id = > tx0000023a42eb7515dcdc0-0065b27627-823feaa-central > -507> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s getting op 1 > -506> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj verifying requester > -505> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj normalizing buckets > and tenants > -504> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj init permissions > -503> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj recalculating target > -502> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj reading permissions > -501> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj init op > -500> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj verifying op mask > -499> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj verifying op permissions > -498> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Searching permissions for > identity=rgw::auth::SysReqApplier -> > rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted, > subuser=, perm_mask=15, is_admin=0) mask=50 > -497> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Searching permissions for > uid=redacted > -496> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Found permission: 15 > -495> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Searching permissions for > group=1 mask=50 > -494> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Permissions for group > not found > -493> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Searching permissions for > group=2 mask=50 > -492> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj Permissions for group > not found > -491> 2024-01-25T14:54:31.588+0000 7f5186648b00 5 req > 2568229052387020224 0.000000000s s3:put_obj -- Getting permissions > done > for identity=rgw::auth::SysReqApplier -> > rgw::auth::LocalApplier(acct_user=redacted, acct_name=redacted, > subuser=, perm_mask=15, is_admin=0), owner=redacted, perm=2 > -490> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj verifying op params > -489> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj pre-executing > -488> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj check rate limiting > -487> 2024-01-25T14:54:31.588+0000 7f5186648b00 2 req > 2568229052387020224 0.000000000s s3:put_obj executing > -486> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req > 2568229052387020224 0.036000550s s3:put_obj NOTICE: call to > do_aws4_auth_completion > -485> 2024-01-25T14:54:31.624+0000 7f5183898b00 5 req > 2568229052387020224 0.036000550s s3:put_obj NOTICE: call to > do_aws4_auth_completion > -484> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req > 2568229052387020224 0.092001401s s3:put_obj completing > -483> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req > 2568229052387020224 0.092001401s s3:put_obj op status=0 > -482> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 2 req > 2568229052387020224 0.092001401s s3:put_obj http status=200 > > -481> 2024-01-25T14:54:31.680+0000 7f5185bc8b00 1 ====== req done > req=0x7f517ffca720 op status=0 http_status=200 > latency=0.092001401s ====== > > Thanks for your help. > > Marc Singer > > On 1/25/24 16:22, Matt Benjamin wrote: > > Hi Marc, > > > > The ops log code is designed to discard data if the socket is > > flow-controlled, iirc. Maybe we just need to handle the signal. > > > > Of course, you should have something consuming data on the > socket, but it's > > still a problem if radosgw exits unexpectedly. > > > > Matt > > > > On Thu, Jan 25, 2024 at 10:08 AM Marc > Singer<marc(a)singer.services> wrote: > > > >> Hi Ceph Users > >> > >> I am encountering a problem with the RGW Admin Ops Socket. > >> > >> I am setting up the socket as follows: > >> > >> rgw_enable_ops_log = true > >> rgw_ops_log_socket_path = /tmp/ops/rgw-ops.socket > >> rgw_ops_log_data_backlog = 16Mi > >> > >> Seems like the socket fills up over time and it doesn't seem to get > >> flushed, at some point the process runs out of file space. > >> > >> Do I need to configure something or send something for the > socket to flush? > >> > >> See the log here: > >> > >> 0> 2024-01-25T13:10:13.908+0000 7f247b00eb00 -1 *** Caught > signal (File > >> size limit exceeded) ** > >> in thread 7f247b00eb00 thread_name:ops_log_file > >> > >> ceph version 18.2.0 > (5dd24139a1eada541a3bc16b6941c5dde975e26d) reef > >> (stable) > >> NOTE: a copy of the executable, or `objdump -rdS > <executable>` is > >> needed to interpret this. > >> > >> --- logging levels --- > >> 0/ 5 none > >> 0/ 1 lockdep > >> 0/ 1 context > >> 1/ 1 crush > >> 1/ 5 mds > >> 1/ 5 mds_balancer > >> 1/ 5 mds_locker > >> 1/ 5 mds_log > >> 1/ 5 mds_log_expire > >> 1/ 5 mds_migrator > >> 0/ 1 buffer > >> 0/ 1 timer > >> 0/ 1 filer > >> 0/ 1 striper > >> 0/ 1 objecter > >> 0/ 5 rados > >> 0/ 5 rbd > >> 0/ 5 rbd_mirror > >> 0/ 5 rbd_replay > >> 0/ 5 rbd_pwl > >> 0/ 5 journaler > >> 0/ 5 objectcacher > >> 0/ 5 immutable_obj_cache > >> 0/ 5 client > >> 1/ 5 osd > >> 0/ 5 optracker > >> 0/ 5 objclass > >> 1/ 3 filestore > >> 1/ 3 journal > >> 0/ 0 ms > >> 1/ 5 mon > >> 0/10 monc > >> 1/ 5 paxos > >> 0/ 5 tp > >> 1/ 5 auth > >> 1/ 5 crypto > >> 1/ 1 finisher > >> 1/ 1 reserver > >> 1/ 5 heartbeatmap > >> 1/ 5 perfcounter > >> 1/ 5 rgw > >> 1/ 5 rgw_sync > >> 1/ 5 rgw_datacache > >> 1/ 5 rgw_access > >> 1/ 5 rgw_dbstore > >> 1/ 5 rgw_flight > >> 1/ 5 javaclient > >> 1/ 5 asok > >> 1/ 1 throttle > >> 0/ 0 refs > >> 1/ 5 compressor > >> 1/ 5 bluestore > >> 1/ 5 bluefs > >> 1/ 3 bdev > >> 1/ 5 kstore > >> 4/ 5 rocksdb > >> 4/ 5 leveldb > >> 1/ 5 fuse > >> 2/ 5 mgr > >> 1/ 5 mgrc > >> 1/ 5 dpdk > >> 1/ 5 eventtrace > >> 1/ 5 prioritycache > >> 0/ 5 test > >> 0/ 5 cephfs_mirror > >> 0/ 5 cephsqlite > >> 0/ 5 seastore > >> 0/ 5 seastore_onode > >> 0/ 5 seastore_odata > >> 0/ 5 seastore_omap > >> 0/ 5 seastore_tm > >> 0/ 5 seastore_t > >> 0/ 5 seastore_cleaner > >> 0/ 5 seastore_epm > >> 0/ 5 seastore_lba > >> 0/ 5 seastore_fixedkv_tree > >> 0/ 5 seastore_cache > >> 0/ 5 seastore_journal > >> 0/ 5 seastore_device > >> 0/ 5 seastore_backref > >> 0/ 5 alienstore > >> 1/ 5 mclock > >> 0/ 5 cyanstore > >> 1/ 5 ceph_exporter > >> 1/ 5 memstore > >> -2/-2 (syslog threshold) > >> 99/99 (stderr threshold) > >> --- pthread ID / name mapping for recent threads --- > >> 7f2472a89b00 / safe_timer > >> 7f2472cadb00 / radosgw > >> ... > >> log_file > >> > >> > /var/lib/ceph/crash/2024-01-25T13:10:13.909546Z_01ee6e6a-e946-4006-9d32-e17ef2f9df74/log > >> --- end dump of recent events --- > >> reraise_fatal: default handler for signal 25 didn't terminate > the process? > >> > >> Thank you for your help. > >> > >> Marc > >> _______________________________________________ > >> ceph-users mailing list --ceph-users(a)ceph.io > >> To unsubscribe send an email toceph-users-leave(a)ceph.io > >> > > > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io > > > > -- > > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309

3 months, 1 week

2
2
0 0

Ceph stretch mode connect to local datacenter

by Oleksandr 34

Hello. I have a ceph cluster which works in stretch mode: *DC1:* node1 (osd, mon, mgr) node2 (osd, mon) node3 (osd, mds) *DC2:* node1 (osd, mon, mgr) node2 (osd, mon) node3 (osd, mds) *DC3:* node1 (mon) Datacenters are distributions between different locations. I use RBD on my clients. How can I set up my clients to connect to the local datacenter? I don't need big traffic between datacenters only for replications, so my client in DC1 should connect to ceph in DC1. But when something happened with DC1 my clients in DC1 should still be working. In configs on my clients I set up all cluster monitors. Is this possible at all?

3 months, 1 week

1
0
0 0

Scrubbing?

by Jan Marek

Hello, last week I've got a HEALTH_OK on our CEPH cluster and I started upgrade firmware in network cards. When I had upgraded the sixth card from nine (one-by-one), this server didn't started correctly and our ProxMox had problem with accessing disk images on CEPH. rbd ls pool was OK, but: rbd ls pool -l didn't work. Our virtual servers had a trouble to work with disks. After I resolve network problem with OSD server, everythink returning to normal state. But I've found, that every OSD nod have very high activity: when I've started 'iotop', there was very high load: around 180MB/s read and 20MB/s write. In this time, cluster was in the HEALTH_OK state. I've found, that there is a massive scrubbing activity... After a few days, I have on our OSD nodes around 90MB/s read and 70MB/s write while 'ceph -s' have client io as 2,5MB/s read and 50MB/s write. I've found in log file of our mon server many lines about starting of scrubbing, but there are many messages about starting of scrubb the same PG? I've grep'ed syslog for some of them and attach it to this e-mail. Is this activity OK? Why CEPH start scrubing this PG once and once again? And another question: Is scrubbing part of mClock scheduler? Many thanks for explanation. Sincerely Jan Marek -- Ing. Jan Marek University of South Bohemia Academic Computer Centre Phone: +420389032080 http://www.gnu.org/philosophy/no-word-attachments.cs.html

3 months, 1 week

4
7
0 0

How check local network

by Albert Shih

Hi When I deploy my cluster I didn't notice on two of my servers the private network was not working (wrong vlan), now it's working, but how can I check the it's indeed working (currently I don't have data). Regards -- Albert SHIH 🦫 🐸 France Heure locale/Local time: lun. 29 janv. 2024 22:36:01 CET

3 months, 1 week

2
2
0 0

Unsetting maintenance mode for failed host

by Bryce Nicholls

Hi We put a host in maintenance and had issues bringing it back. Is there a safe way of exiting maintenance while the host is unreachable / offline? We would like the cluster to rebalance while we are working to get this host back online. Maintenance was set using: ceph orch host maintenance enter osd1 I tried exiting using: ceph orch host maintenance exit osd1 but got the below stacktrace. root@mon1 ~ # ceph orch host maintenance exit osd1 Error EINVAL: Traceback (most recent call last): File "/usr/share/ceph/mgr/mgr_module.py", line 1756, in _handle_command return self.handle_command(inbuf, cmd) File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command return dispatch[cmd['prefix']].call(self, cmd, inbuf) File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call return self.func(mgr, **kwargs) File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda> wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs) # noqa: E731 File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper return func(*args, **kwargs) File "/usr/share/ceph/mgr/orchestrator/module.py", line 455, in _host_maintenance_exit raise_if_exception(completion) File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception e = pickle.loads(c.serialized_exception) TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr' Thanks Bryce Bryce Nicholls OpenStack Engineer Bryce.Nicholls92(a)thehutgroup.com [THG Ingenuity Logo]<https://www.thg.com> [https://i.imgur.com/wbpVRW6.png]<https://www.linkedin.com/company/thgplc/?originalSubdomain=uk> [https://i.imgur.com/c3040tr.png] <https://twitter.com/thgplc?lang=en>

3 months, 1 week

2
1
0 0

January Ceph Science Virtual User Group

by Kevin Hrpcek

Hey All, We will be having a Ceph science/research/big cluster call on Wednesday January 31st. If anyone wants to discuss something specific they can add it to the pad linked below. If you have questions or comments you can contact me. This is an informal open call of community members mostly from hpc/htc/research/big cluster environments (though anyone is welcome) where we discuss whatever is on our minds regarding ceph. Updates, outages, features, maintenance, etc...there is no set presenter but I do attempt to keep the conversation lively. Pad URL: https://pad.ceph.com/p/Ceph_Science_User_Group_20240131 Virtual event details: January 31, 2024 15:00 UTC 4pm Central European 9am Central US Description: Main pad for discussions: https://pad.ceph.com/p/Ceph_Science_User_Group_Index Meetings will be recorded and posted to the Ceph Youtube channel. To join the meeting on a computer or mobile phone: https://meet.jit.si/ceph-science-wg Kevin -- Kevin Hrpcek NASA VIIRS Atmosphere SIPS/TROPICS Space Science & Engineering Center University of Wisconsin-Madison

3 months, 1 week

1
0
0 0

RadosGW manual deployment

by Jan Kasprzak

Hi all, how can radosgw be deployed manually? For Ceph cluster deployment, there is still (fortunately!) a documented method which works flawlessly even in Reef: https://docs.ceph.com/en/latest/install/manual-deployment/#monitor-bootstra… But as for radosgw, there is no such description, unless I am missing something. Even going back to the oldest docs still available at docs.ceph.com (mimic), the radosgw installation is described only using ceph-deploy: https://docs.ceph.com/en/mimic/install/install-ceph-gateway/ Is it possible to install a new radosgw instance manually? If so, how can I do it? Thanks! -Yenya -- | Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> | | https://www.fi.muni.cz/~kas/ GPG: 4096R/A45477D5 | We all agree on the necessity of compromise. We just can't agree on when it's necessary to compromise. --Larry Wall

3 months, 1 week

3
9
0 0

easy way to find out the number of allocated objects for a RBD image

by Tony Liu

Hi, Other than get all objects of the pool and filter by image ID, is there any easier way to get the number of allocated objects for a RBD image? What I really want to know is the actual usage of an image. An allocated object could be used partially, but that's fine, no need to be 100% accurate. To get the object count and times object size, that should be sufficient. "rbd export" exports actual used data, but to get the actual usage by exporting the image seems too much. This brings up another question, is there any way to know the export size before running it? Thanks! Tony

3 months, 1 week

4
4
0 0

2024

2023

2022

2021

2020

2019

ceph-users January 2024