Hi Erich,
great that you recovered from this.
It sounds like you had the same problem I had a few months ago.
mds crashes after up:replay state - ceph-users - lists.ceph.io
<https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/IRV6K74GWE2SWAWQZVUDQAPSMY4J4R4D/#UBPVYXO5ODABKUZ436HN4WBX7QJUXY3P>
Kind regards,
Lars
Lars Köppel
Developer
Email: lars.koeppel(a)ariadne.ai
Phone: +49 6221 5993580
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
On Mon, Apr 22, 2024 at 11:31 PM Sake Ceph <ceph(a)paulusma.eu> wrote:
100 GB of RAM! Damn, that's a lot for a filesystem in my opinion, or am I
wrong?
Kind regards,
Sake
On 22-04-2024 21:50 CEST, Erich Weiler <weiler(a)soe.ucsc.edu> wrote:
I was able to start another MDS daemon on another node that had 512GB
RAM, and then the active MDS eventually migrated there, and went through
the replay (which consumed about 100 GB of RAM), and then things
recovered. Phew. I guess I need significantly more RAM in my MDS
servers... I had no idea the MDS daemon could require that much RAM.
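For anyone who ends up in the same spot: with a cephadm-managed cluster, one
way to do this is to widen the MDS placement spec so a daemon lands on the
bigger box, and (if the rank doesn't move on its own) fail it over to a
standby. A rough sketch only, assuming cephadm; "big-node-01" is a made-up
hostname, and the fs name and daemon count need to match your setup:

# widen the MDS placement to include the high-memory host
# ('big-node-01' is a placeholder for the 512GB box)
ceph orch apply mds slugfs --placement="3 pr-md-01 pr-md-02 big-node-01"
# once the new daemon registers as a standby, fail the struggling rank so
# a standby (ideally the one on the big node) takes over and replays there
ceph mds fail slugfs:0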
-erich
On 4/22/24 11:41 AM, Erich Weiler wrote:
> Possibly, but it would be pretty time consuming and difficult...
>
> Is it maybe a RAM issue since my MDS RAM is filling up? Should I maybe
> bring up another MDS on another server with a huge amount of RAM and move
> the MDS there in hopes it will have enough RAM to complete the replay?
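>
> (For reference, per-daemon MDS memory can be checked like this -- a sketch,
> assuming a cephadm deployment and that the daemon names match:
>
> # per-daemon memory as the orchestrator reports it (MEM USE / MEM LIM)
> ceph orch ps --daemon-type mds
> # or just watch the raw ceph-mds processes on the MDS host
> top -p "$(pgrep -d, ceph-mds)"
> )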
>
> On 4/22/24 11:37 AM, Sake Ceph wrote:
>> Just a question: is it possible to block or disable all clients? Just
>> to prevent load on the system.
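>>
>> (On recent releases there is a filesystem flag for refusing new client
>> sessions; whether it helps while the MDS is still in replay I'm not sure,
>> so treat this as a sketch:
>>
>> # reject new CephFS client sessions for this filesystem
>> ceph fs set slugfs refuse_client_session true
>> # evicting already-connected sessions needs an active MDS, e.g.:
>> ceph tell mds.slugfs.pr-md-01.xdtppo client ls
>> ceph tell mds.slugfs.pr-md-01.xdtppo client evict id=<client-id>
>> and remember to flip refuse_client_session back to false afterwards.)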
>>
>> Kind regards,
>> Sake
>>> On 22-04-2024 20:33 CEST, Erich Weiler <weiler(a)soe.ucsc.edu> wrote:
>>>
>>> I also see this from 'ceph health detail':
>>>
>>> # ceph health detail
>>> HEALTH_WARN 1 filesystem is degraded; 1 MDSs report oversized cache; 1 MDSs behind on trimming
>>> [WRN] FS_DEGRADED: 1 filesystem is degraded
>>> fs slugfs is degraded
>>> [WRN] MDS_CACHE_OVERSIZED: 1 MDSs report oversized cache
>>> mds.slugfs.pr-md-01.xdtppo(mds.0): MDS cache is too large
>>> (19GB/8GB); 0 inodes in use by clients, 0 stray files
>>> [WRN] MDS_TRIM: 1 MDSs behind on trimming
>>> mds.slugfs.pr-md-01.xdtppo(mds.0): Behind on trimming (127084/250) max_segments: 250, num_segments: 127084
>>>
>>> MDS cache too large? The mds process is taking up 22GB right now and
>>> starting to push my server into swap, so maybe it somehow is too large....
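>>>
>>> If I understand it right, the "8GB" in that warning is
>>> mds_cache_memory_limit, and the warning fires once usage passes
>>> mds_health_cache_threshold times that limit. A sketch for checking and
>>> temporarily raising it -- the 32 GiB value is only an example, and the
>>> MDS can still overshoot the limit during replay:
>>>
>>> # current cache limit, in bytes
>>> ceph config get mds mds_cache_memory_limit
>>> # raise it, e.g. to 32 GiB, for all MDS daemons
>>> ceph config set mds mds_cache_memory_limit 34359738368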
>>>
>>> On 4/22/24 11:17 AM, Erich Weiler wrote:
>>>> Hi All,
>>>>
>>>> We have a somewhat serious situation where we have a cephfs filesystem
>>>> (18.2.1), and 2 active MDSs (one standby). I tried to restart one of the
>>>> active daemons to unstick a bunch of blocked requests, and the standby
>>>> went into 'replay' for a very long time, then RAM on that MDS server
>>>> filled up, and it just stayed there for a while, then eventually
>>>> appeared to give up and switched to the standby, but the cycle started
>>>> again. So I restarted that MDS, and now I'm in a situation where I see
>>>> this:
>>>>
>>>> # ceph fs status
>>>> slugfs - 29 clients
>>>> ======
>>>> RANK   STATE              MDS               ACTIVITY   DNS     INOS    DIRS   CAPS
>>>>  0     replay    slugfs.pr-md-01.xdtppo               3958k   57.1k   12.2k    0
>>>>  1     resolve   slugfs.pr-md-02.sbblqq                   0       3       1    0
>>>> POOL TYPE USED AVAIL
>>>> cephfs_metadata metadata 997G 2948G
>>>> cephfs_md_and_data data 0 87.6T
>>>> cephfs_data data 773T 175T
>>>> STANDBY MDS
>>>> slugfs.pr-md-03.mclckv
>>>> MDS version: ceph version 18.2.1
>>>> (7fe91d5d5842e04be3b4f514d6dd990c54b29c76) reef (stable)
>>>>
>>>> It just stays there indefinitely. All my clients are hung. I tried
>>>> restarting all MDS daemons and they just went back to this state after
>>>> coming back up.
>>>>
>>>> Is there any way I can somehow escape this state of indefinite
>>>> replay/resolve?
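>>>>
>>>> The only hint I can find on whether replay is even moving is the
>>>> journal positions from the MDS admin socket (run wherever the daemon's
>>>> admin socket is reachable; I'm not sure I'm reading the fields right,
>>>> so take it as a sketch):
>>>>
>>>> # mds_log perf counters: rdpos should creep toward wrpos while replay
>>>> # is making progress
>>>> ceph daemon mds.slugfs.pr-md-01.xdtppo perf dump mds_log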
>>>>
>>>> Thanks so much! I'm kinda nervous since none of my clients have
>>>> filesystem access at the moment...
>>>>
>>>> cheers,
>>>> erich
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io