Hi all,
on an NFS re-export of a ceph-fs (kernel client) I observe a very strange error. I'm un-tarring a large package (1.2G) and after some time I get these errors:
ln: failed to create hard link 'file name': Read-only file system
The strange thing is that this seems to be only temporary. When I used "ln src dst" for manual testing, the command failed as above. However, when I then tried "ln -v src dst", this command created the hard link with exactly the same path arguments. During the period when the error occurs, I can't see any file system in read-only mode, neither on the NFS client nor on the NFS server. The funny thing is that file creation and writes still work; it's only the hard-link creation that fails.
For details, the set-up is:
file-server: mount ceph-fs at /shares/path, export /shares/path as nfs4 to other server
other server: mount /shares/path as NFS
More precisely, on the file-server:
fstab: MON-IPs:/shares/folder /shares/nfs/folder ceph defaults,noshare,name=NAME,secretfile=sec.file,mds_namespace=FS-NAME,_netdev 0 0
exports: /shares/nfs/folder -no_root_squash,rw,async,mountpoint,no_subtree_check DEST-IP
On the host at DEST-IP:
fstab: FILE-SERVER-IP:/shares/nfs/folder /mnt/folder nfs defaults,_netdev 0 0
Both the file server and the client server are virtual machines. The file server is on CentOS 8 Stream (4.18.0-338.el8.x86_64) and the client machine is on AlmaLinux 8 (4.18.0-425.13.1.el8_7.x86_64).
When I change the NFS export from "async" to "sync", everything works. However, that's a rather bad workaround and not a solution. Although this looks like an NFS issue, I'm afraid it is a problem with hard links and ceph-fs. It looks like a race between scheduling and executing operations on the ceph-fs kernel mount.
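For reference, the workaround is only this one option in the export line (a minimal sketch with the same placeholders as above, re-exported with `exportfs -ra` afterwards):
```
# /etc/exports on the file-server: "sync" instead of "async"
/shares/nfs/folder -no_root_squash,rw,sync,mountpoint,no_subtree_check DEST-IP
```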
Has anyone seen something like that?
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
On Thu, Dec 15, 2022 at 9:32 AM Stolte, Felix <f.stolte(a)fz-juelich.de> wrote:
>
> Hi Patrick,
>
> we used your script to repair the damaged objects on the weekend and it went smoothly. Thanks for your support.
>
> We adjusted your script to scan for damaged files on a daily basis; the runtime is about 6 h. Until Thursday last week, we had exactly the same 17 files. On Thursday at 13:05 a snapshot was created and our active MDS crashed once at exactly that time:
>
> 2022-12-08T13:05:48.919+0100 7f440afec700 -1 /build/ceph-16.2.10/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7f440afec700 time 2022-12-08T13:05:48.921223+0100
> /build/ceph-16.2.10/src/mds/ScatterLock.h: 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
>
> 12 minutes later the unlink_local error crashes appeared again, this time with a new file. During debugging we noticed an MTU mismatch between the MDS (1500) and a client (9000) with a CephFS kernel mount. That client is also creating the snapshots via mkdir in the .snap directory.
>
> We disabled snapshot creation for now, but really need this feature. I uploaded the mds logs of the first crash along with the information above to https://tracker.ceph.com/issues/38452
>
> I would greatly appreciate it if you could answer the following questions:
>
> Is the bug related to our MTU mismatch? We also fixed the MTU issue over the weekend by going back to 1500 on all nodes in the Ceph public network.
I doubt it.
> If you need a debug level 20 log of the ScatterLock for further analysis, I could schedule snapshots at the end of our workdays and increase the debug level for 5 minutes around snapshot creation.
This would be very helpful!
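Something along these lines would do (a rough sketch, assuming the centralized config store; pick whatever window around the snapshot is convenient):
```
# raise MDS logging shortly before the snapshot is taken
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1

# ... create the snapshot via mkdir in the .snap directory ...

# a few minutes later, drop back to the defaults
ceph config rm mds debug_mds
ceph config rm mds debug_ms
```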
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
hi Ernesto and lists,
> [1] https://github.com/ceph/ceph/pull/47501
Are we planning to backport this to quincy so we can support centos 9
there? Enabling that upgrade path on centos 9 was one of the
conditions for dropping centos 8 support in reef, which I'm still keen
to do.
If not, can we find another resolution to
https://tracker.ceph.com/issues/58832? As I understand it, all of
those python packages exist in centos 8. Do we know why they were
dropped for centos 9? Have we looked into making those available in
epel? (cc Ken and Kaleb)
On Fri, Sep 2, 2022 at 12:01 PM Ernesto Puerta <epuertat(a)redhat.com> wrote:
>
> Hi Kevin,
>
>>
>> Isn't this one of the reasons containers were pushed, so that the packaging isn't as big a deal?
>
>
> Yes, but the Ceph community has a strong commitment to provide distro packages for those users who are not interested in moving to containers.
>
>> Is it the continued push to support lots of distros without using containers that is the problem?
>
>
> If not a problem, it definitely makes it more challenging. Compiled components often sort this out by statically linking deps whose packages are not widely available in distros. The approach we're proposing here would be the closest equivalent to static linking for interpreted code (bundling).
>
> Thanks for sharing your questions!
>
> Kind regards,
> Ernesto
Hi all,
our monitors have enjoyed democracy since the beginning. However, I don't share their sudden excitement about voting:
2/9/23 4:42:30 PM[INF]overall HEALTH_OK
2/9/23 4:42:30 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:42:26 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:40:00 PM[INF]overall HEALTH_OK
2/9/23 4:30:00 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-03 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:04 PM[INF]overall HEALTH_OK
2/9/23 4:24:03 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:23:59 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:23:59 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:20:00 PM[INF]overall HEALTH_OK
2/9/23 4:10:00 PM[INF]overall HEALTH_OK
2/9/23 4:00:00 PM[INF]overall HEALTH_OK
2/9/23 3:50:00 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 3:43:08 PM[INF]mon.ceph-01 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-26 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-25 calling monitor election
We moved a switch from one rack to another, and after the switch came back up, the monitors frequently bitch about who is the alpha. How do I get them to focus more on their daily duties again?
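For anyone who wants to look at it with me, the election churn is visible with the standard status commands (a minimal sketch, nothing cluster-specific beyond the mon name from the log above):
```
# current quorum members and election epoch
ceph quorum_status

# monitor ranks and addresses at a glance
ceph mon stat

# how an individual monitor sees its peers (run on the mon host)
ceph daemon mon.ceph-01 mon_status
```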
Thanks for any help!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
I am running Ceph 15.2.13 on CentOS 7.9.2009 and recently my MDS servers
have started failing with the error message
In function 'void Server::handle_client_open(MDRequestRef&)' thread
7f0ca9908700 time 2021-06-28T09:21:11.484768+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.13/rpm/el7/BUILD/ceph-15.2.13/src/mds/Server.cc:
4149: FAILED ceph_assert(cur->is_auth())
Complete log is:
https://gist.github.com/pvanheus/4da555a6de6b5fa5e46cbf74f5500fbd
ceph status output is:
# ceph status
  cluster:
    id:     ed7b2c16-b053-45e2-a1fe-bf3474f90508
    health: HEALTH_WARN
            30 OSD(s) experiencing BlueFS spillover
            insufficient standby MDS daemons available
            1 MDSs report slow requests
            2 mgr modules have failed dependencies
            4347046/326505282 objects misplaced (1.331%)
            6 nearfull osd(s)
            23 pgs not deep-scrubbed in time
            23 pgs not scrubbed in time
            8 pool(s) nearfull

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 22m)
    mgr: ceph-mon1(active, since 11w), standbys: ceph-mon2, ceph-mon3
    mds: SANBI_FS:2 {0=ceph-mon1=up:active(laggy or crashed),1=ceph-mon2=up:stopping}
    osd: 54 osds: 54 up (since 2w), 54 in (since 11w); 50 remapped pgs

  data:
    pools:   8 pools, 833 pgs
    objects: 42.37M objects, 89 TiB
    usage:   159 TiB used, 105 TiB / 264 TiB avail
    pgs:     4347046/326505282 objects misplaced (1.331%)
             782 active+clean
             49  active+clean+remapped
             1   active+clean+scrubbing+deep
             1   active+clean+remapped+scrubbing

  io:
    client:   29 KiB/s rd, 427 KiB/s wr, 37 op/s rd, 48 op/s wr
When restarting an MDS, it goes through the states replay, reconnect, and resolve,
and finally sets itself to active before this crash happens.
Any advice on what to do?
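In case it is useful for whoever digs into this, the crash backtraces and filesystem state can be pulled with the standard tooling (a minimal sketch; the crash ID is a placeholder):
```
# recorded daemon crashes and the backtrace of a specific one
ceph crash ls
ceph crash info <crash-id>

# current ranks, states and standby count of the filesystem
ceph fs status
ceph fs dump
```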
Thanks,
Peter
P.S. apologies if you received this email more than once - I have had some
trouble figuring out the correct mailing list to use.
Hi Team,
We have a ceph cluster with 3 storage nodes:
1. storagenode1 - abcd:abcd:abcd::21
2. storagenode2 - abcd:abcd:abcd::22
3. storagenode3 - abcd:abcd:abcd::23
The requirement is to mount ceph using the domain name of a MON node.
Note: we resolve the domain name via a DNS server.
For this we are using the command:
```
mount -t ceph [storagenode.storage.com]:6789:/ /backup -o
name=admin,secret=AQCM+8hjqzuZEhAAcuQc+onNKReq7MV+ykFirg==
```
We are getting the following logs in /var/log/messages:
```
Jan 24 17:23:17 localhost kernel: libceph: resolve 'storagenode.storage.com'
(ret=-3): failed
Jan 24 17:23:17 localhost kernel: libceph: parse_ips bad ip '
storagenode.storage.com:6789'
```
We also tried mounting the ceph storage using the IP of a MON, which works fine.
Query:
Could you please help us out with how we can mount ceph using an FQDN?
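For comparison, a minimal sketch of the address-based form that works for us (the address is storagenode1's from the ceph.conf below, and the secretfile path is only an example); the kernel log above shows libceph itself failing the name lookup, so the name apparently has to be resolved before it reaches the kernel:
```
# same mount, but with a literal MON address (IPv6 literals go in brackets)
mount -t ceph [abcd:abcd:abcd::21]:6789:/ /backup \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```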
My /etc/ceph/ceph.conf is as follows:
[global]
ms bind ipv6 = true
ms bind ipv4 = false
mon initial members = storagenode1,storagenode2,storagenode3
osd pool default crush rule = -1
fsid = 7969b8a3-1df7-4eae-8ccf-2e5794de87fe
mon host =
[v2:[abcd:abcd:abcd::21]:3300,v1:[abcd:abcd:abcd::21]:6789],[v2:[abcd:abcd:abcd::22]:3300,v1:[abcd:abcd:abcd::22]:6789],[v2:[abcd:abcd:abcd::23]:3300,v1:[abcd:abcd:abcd::23]:6789]
public network = abcd:abcd:abcd::/64
cluster network = eff0:eff0:eff0::/64
[osd]
osd memory target = 4294967296
[client.rgw.storagenode1.rgw0]
host = storagenode1
keyring = /var/lib/ceph/radosgw/ceph-rgw.storagenode1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-storagenode1.rgw0.log
rgw frontends = beast endpoint=[abcd:abcd:abcd::21]:8080
rgw thread pool size = 512
--
~ Lokendra
skype: lokendrarathour
Hello,
What's the status with the *-stable-* tags?
https://quay.io/repository/ceph/daemon?tab=tags
Are they no longer built/supported?
What should we use until we migrate from ceph-ansible to cephadm?
Thanks.
--
Jonas
Details of this release are summarized here:
https://tracker.ceph.com/issues/59070#note-1
Release Notes - TBD
The reruns were in the queue for 4 days because of some slowness issues.
The core team (Neha, Radek, Laura, and others) are trying to narrow
down the root cause.
Seeking approvals/reviews for:
rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to test
and merge at least one PR https://github.com/ceph/ceph/pull/50575 for
the core)
rgw - Casey
fs - Venky (the fs suite has an unusually high number of failed jobs;
any reason to suspect it in the observed slowness?)
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/octopus-x - Laura is looking into failures
upgrade/pacific-x - Laura is looking into failures
upgrade/quincy-p2p - Laura is looking into failures
client-upgrade-octopus-quincy-quincy - missing packages, Adam Kraitman
is looking into it
powercycle - Brad
ceph-volume - needs a rerun on merged
https://github.com/ceph/ceph-ansible/pull/7409
Please reply to this email with approval and/or trackers of known
issues/PRs to address them.
Also, share any findings or hypotheses about the slowness in the
execution of the suite.
Josh, Neha - gibba and LRC upgrades pending major suites approvals.
RC release - pending major suites approvals.
Thx
YuriW
Hello,
After a successful upgrade of a Ceph cluster from 16.2.7 to 16.2.11, I needed to downgrade it back to 16.2.7 as I found an issue with the new version.
I expected that running the downgrade with `ceph orch upgrade start --ceph-version 16.2.7` would work fine. However, it got stuck right after the downgrade of the first MGR daemon: the downgraded daemon is no longer able to load the cephadm module, and any `ceph orch` command fails with the following error:
```
$ ceph orch ps
Error ENOENT: Module not found
```
And the downgrade process is therefore blocked.
These are the logs of the MGR when issuing the command:
```
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.557+0000 7f828fe8c700 0 log_channel(audit) log [DBG] : from='client.3136173 -' entity='client.admin' cmd=[{"prefix": "orch ps", "target": ["mon-mgr", ""]}]: dispatch
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.list_daemons(*(None, None), **{'daemon_id': None, 'host': None, 'refresh': False})
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 0 [orchestrator DEBUG root] _oremote orchestrator -> cephadm.get_feature_set(*(), **{})
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 no module 'cephadm'
Mar 28 12:13:15 astano03 ceph-c57586c4-8e44-11eb-a116-248a07aa8d2e-mgr-astano03-qtzccn[2232770]: debug 2023-03-28T10:13:15.558+0000 7f829068d700 -1 mgr.server reply reply (2) No such file or directory Module not found
```
Other interesting MGR logs are:
```
2023-03-28T11:05:59.519+0000 7fcd16314700 4 mgr get_store get_store key: mgr/cephadm/upgrade_state
2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Failed to construct class in 'cephadm'
2023-03-28T11:05:59.519+0000 7fcd16314700 -1 mgr load Traceback (most recent call last):
e "/usr/share/ceph/mgr/cephadm/module.py", line 450, in __init__
elf.upgrade = CephadmUpgrade(self)
e "/usr/share/ceph/mgr/cephadm/upgrade.py", line 111, in __init__
elf.upgrade_state: Optional[UpgradeState] = UpgradeState.from_json(json.loads(t))
e "/usr/share/ceph/mgr/cephadm/upgrade.py", line 92, in from_json
eturn cls(**c)
rror: __init__() got an unexpected keyword argument 'daemon_types'
2023-03-28T11:05:59.521+0000 7fcd16314700 -1 mgr operator() Failed to run module in active mode ('cephadm')
```
These seem to relate to the new staggered-upgrades feature.
Please note that everything was working fine with version 16.2.7 before the upgrade.
I am currently stuck in this situation, with only one MGR daemon left on version 16.2.11, which is the only one still working fine:
```
[root@astano01 ~]# ceph orch ps | grep mgr
mgr.astano02.mzmewn astano02 *:8443,9283 running (5d) 43s ago 2y 455M - 16.2.11 7a63bce27215 e2d7806acf16
mgr.astano03.qtzccn astano03 *:8443,9283 running (3m) 22s ago 95m 383M - 16.2.7 463ec4b1fdc0 cc0d88864fa1
```
Has anyone already faced this issue, or does anyone know how I can make the 16.2.7 MGR load the cephadm module correctly?
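For what it's worth, the traceback above points at the stored mgr/cephadm/upgrade_state blob, so one untested idea would be to clear it and restart the downgraded MGR (a sketch only, not a verified procedure; please correct me if this is a bad idea):
```
# back up and inspect the state the 16.2.7 module chokes on
ceph config-key get mgr/cephadm/upgrade_state > upgrade_state.backup.json

# remove it so the cephadm module can be constructed without the unknown field
ceph config-key rm mgr/cephadm/upgrade_state

# then restart the downgraded mgr (e.g. its ceph-<fsid>@mgr.astano03.qtzccn
# systemd unit on astano03) so it reloads its modules
```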
Thanks in advance for any help!
Dear all;
Up until a few hours ago, I had a seemingly normally-behaving cluster
(Quincy, 17.2.5) with 36 OSDs, evenly distributed across 3 of its 6
nodes. The cluster is only used for CephFS and the only non-standard
configuration I can think of is that I had 2 active MDSs, but only 1
standby. I had also doubled mds_cache_memory_limit to 8 GB (all OSD
hosts have 256 G of RAM) at some point in the past.
Then I rebooted one of the OSD nodes. The rebooted node held one of the
active MDSs. Now the node is back up: ceph -s says the cluster is
healthy, but all PGs are in an active+clean+remapped state and 166.67% of
the objects are misplaced (dashboard: -66.66% healthy).
The data pool is a threefold replica with 5.4M objects; the number of
misplaced objects is reported as 27087410/16252446. The denominator in
the ratio makes sense to me (16.2M / 3 = 5.4M), but the numerator does
not. I also note that the ratio is *exactly* 5 / 3. The filesystem is
still mounted and appears to be usable, but df reports it as 100% full;
I suspect it would say 167% but that is capped somewhere.
Any ideas about what is going on? Any suggestions for recovery?
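In case the raw numbers help, this is how they can be pulled out (standard status commands only, nothing cluster-specific assumed):
```
# which PGs are remapped and where they want to go
ceph pg ls remapped | head -n 20

# per-OSD utilisation and CRUSH tree after the reboot
ceph osd df tree

# pool-level object and usage counters behind the misplaced ratio
ceph df detail
```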
// Best wishes; Johan