umount + mount worked. Thanks!
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Dan van der Ster <dan(a)vanderster.com>
Sent: 30 October 2020 10:22:38
To: Frank Schilder
Cc: ceph-users
Subject: Re: [ceph-users] MDS_CLIENT_LATE_RELEASE: 3 clients failing to respond to capability release
Hi,
You said you dropped caches -- can you try again with
    echo 3 > /proc/sys/vm/drop_caches
?
Otherwise, does umount then mount from one of the clients clear the warning?
(I don't believe this is due to a "busy client", but rather a kernel
client bug where it doesn't release caps in some cases -- we've seen
this in the past but not recently).
-- Dan
On Fri, Oct 30, 2020 at 10:13 AM Frank Schilder <frans(a)dtu.dk> wrote:
>
> Dear cephers,
>
> I have a somewhat strange situation. I have the health warning:
>
> # ceph health detail
> HEALTH_WARN 3 clients failing to respond to capability release
> MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release
> mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617
> mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358
> mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150
>
> However, these clients are not busy right now. Also, they hold almost nothing; see
> the snippets from "session ls" below. It is possible that a very I/O-intensive
> application was running on these nodes and these release requests got stuck. How do I
> resolve this issue? Can I just evict the clients?
>
> Version is mimic 13.2.8. Note that we execute a drop cache command after a job
> finishes on these clients. It's possible that the clients had already dropped the caps
> before the MDS request was handled/received.
>
> Best regards,
> Frank
>
> {
>     "id": 30717358,
>     "num_leases": 0,
>     "num_caps": 44,
>     "state": "open",
>     "request_load_avg": 0,
>     "uptime": 6632206.332307,
>     "replay_requests": 0,
>     "completed_requests": 0,
>     "reconnecting": false,
>     "inst": "client.30717358 192.168.57.140:0/3212676185",
>     "client_metadata": {
>         "features": "00000000000000ff",
>         "entity_id": "con-fs2-hpc",
>         "hostname": "sn269.hpc.ait.dtu.dk",
>         "kernel_version": "3.10.0-957.12.2.el7.x86_64",
>         "root": "/hpc/home"
>     }
> },
> --
> {
>     "id": 30716617,
>     "num_leases": 0,
>     "num_caps": 48,
>     "state": "open",
>     "request_load_avg": 1,
>     "uptime": 6632206.336307,
>     "replay_requests": 0,
>     "completed_requests": 1,
>     "reconnecting": false,
>     "inst": "client.30716617 192.168.56.233:0/2770977433",
>     "client_metadata": {
>         "features": "00000000000000ff",
>         "entity_id": "con-fs2-hpc",
>         "hostname": "sn106.hpc.ait.dtu.dk",
>         "kernel_version": "3.10.0-957.12.2.el7.x86_64",
>         "root": "/hpc/home"
>     }
> },
> --
> {
>     "id": 30749150,
>     "num_leases": 0,
>     "num_caps": 44,
>     "state": "open",
>     "request_load_avg": 0,
>     "uptime": 6632206.338307,
>     "replay_requests": 0,
>     "completed_requests": 0,
>     "reconnecting": false,
>     "inst": "client.30749150 192.168.56.136:0/2578719015",
>     "client_metadata": {
>         "features": "00000000000000ff",
>         "entity_id": "con-fs2-hpc",
>         "hostname": "sn009.hpc.ait.dtu.dk",
>         "kernel_version": "3.10.0-957.12.2.el7.x86_64",
>         "root": "/hpc/home"
>     }
> },
>
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
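
For anyone hitting the same warning: the manual step in this thread of matching the client_id values from "ceph health detail" against the "session ls" entries could be scripted roughly as below. This is only a sketch, not an official Ceph tool. It assumes the health text has the line format quoted above and that "session ls" returns a JSON array of session objects with the fields shown in the snippets. The function names late_release_client_ids and summarize_sessions are made up for illustration, and the sample data is trimmed from the output quoted in this thread.

```python
import json
import re
from typing import Dict, List

# Sample text copied from the "ceph health detail" output quoted in this thread.
HEALTH_DETAIL = """\
HEALTH_WARN 3 clients failing to respond to capability release
MDS_CLIENT_LATE_RELEASE 3 clients failing to respond to capability release
    mdsceph-12(mds.0): Client sn106.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30716617
    mdsceph-12(mds.0): Client sn269.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30717358
    mdsceph-12(mds.0): Client sn009.hpc.ait.dtu.dk:con-fs2-hpc failing to respond to capability release client_id: 30749150
"""

# Trimmed copies of the three "session ls" entries from this thread.
# Assumption: the real command prints a JSON array of such objects.
SESSIONS_JSON = """
[
  {"id": 30717358, "num_caps": 44, "state": "open",
   "client_metadata": {"hostname": "sn269.hpc.ait.dtu.dk",
                       "kernel_version": "3.10.0-957.12.2.el7.x86_64"}},
  {"id": 30716617, "num_caps": 48, "state": "open",
   "client_metadata": {"hostname": "sn106.hpc.ait.dtu.dk",
                       "kernel_version": "3.10.0-957.12.2.el7.x86_64"}},
  {"id": 30749150, "num_caps": 44, "state": "open",
   "client_metadata": {"hostname": "sn009.hpc.ait.dtu.dk",
                       "kernel_version": "3.10.0-957.12.2.el7.x86_64"}}
]
"""

# Matches only the per-client detail lines; the two summary lines carry no client_id.
_LATE_RELEASE = re.compile(
    r"failing to respond to capability release client_id: (\d+)"
)


def late_release_client_ids(health_detail: str) -> List[int]:
    """Extract client_id values from the per-client warning lines (hypothetical helper)."""
    return [int(m.group(1)) for m in _LATE_RELEASE.finditer(health_detail)]


def summarize_sessions(sessions: List[dict], client_ids: List[int]) -> List[Dict]:
    """Return id, hostname, num_caps and kernel for the flagged sessions (hypothetical helper)."""
    wanted = set(client_ids)
    return [
        {
            "id": s["id"],
            "hostname": s["client_metadata"]["hostname"],
            "num_caps": s["num_caps"],
            "kernel_version": s["client_metadata"]["kernel_version"],
        }
        for s in sessions
        if s["id"] in wanted
    ]


if __name__ == "__main__":
    ids = late_release_client_ids(HEALTH_DETAIL)
    for row in summarize_sessions(json.loads(SESSIONS_JSON), ids):
        print(row)
```

The summary only lists which hosts to look at and how many caps each still holds. Clearing the warning is still done on the client itself, as discussed above (drop caches, or umount followed by mount).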