On Thu, 27 Feb 2020 at 06:27, Anthony D'Atri <aad(a)dreamsnake.net> wrote:
> If the heap stats reported by telling the OSD `heap stats` are large, telling each OSD `heap release` may be useful. I suspect a tcmalloc shortcoming.
osd.158 tcmalloc heap stats:
------------------------------------------------
MALLOC: 5722761448 ( 5457.7 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 311621760 ( 297.2 MiB) Bytes in central cache freelist
MALLOC: + 26242992 ( 25.0 MiB) Bytes in transfer cache freelist
MALLOC: + 62721768 ( 59.8 MiB) Bytes in thread cache freelists
MALLOC: + 113340608 ( 108.1 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 6236688576 ( 5947.8 MiB) Actual memory used (physical + swap)
MALLOC: + 21415870464 (20423.8 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 27652559040 (26371.5 MiB) Virtual address space used
MALLOC:
MALLOC: 394518 Spans in use
MALLOC: 37 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
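For reference, a sketch of doing this across all OSDs (untested; assumes the `ceph` CLI on an admin node — `ceph tell osd.* heap release` should also work on recent releases):

```shell
# Ask each OSD to return freed heap pages to the OS (tcmalloc madvise()).
for osd in $(ceph osd ls); do
    ceph tell "osd.${osd}" heap release
done
```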
Hello Guys,
Unfortunately, I've deleted some caps from client.admin and tried the following solution to set them back:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-January/015474.html
I've tried the following:
# SSH'd to a mon node and changed dir to the mon directory
cd /var/lib/ceph/mon/<monname>
# tried to authenticate with the monitor keyring and set the client.admin caps to give back full permissions
ceph -n mon.<monid> --keyring keyring auth caps client.admin mds 'allow *' osd 'allow *' mon 'allow *'
When I try to modify the client.admin caps, the command just hangs at the shell until I press Ctrl-C, which gets acknowledged with "Cluster connection aborted".
Also, I can't see the connection in ceph.audit.log.
Am I making a mistake here, or is this "workaround" no longer supported in Nautilus (14.2.6)?
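For completeness, the variant described in that old thread authenticates as the bare `mon.` entity rather than `mon.<monid>` — a sketch (the mon data dir path is a placeholder; untested on Nautilus):

```shell
# Use the special "mon." entity, whose key lives in the monitor's data
# directory, to restore full caps on client.admin.
ceph -n mon. --keyring /var/lib/ceph/mon/<monname>/keyring \
    auth caps client.admin mds 'allow *' osd 'allow *' mon 'allow *'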
Regards
Sorry, I am not really a developer, nor do I know the details of POSIX/filesystems. But if
I may ask, why are you taking this timestamp from the parent?
-----Original Message-----
Sent: 11 March 2020 11:32
To: Jeff Layton
Cc: Marc Roos; ceph-users; ceph-devel(a)vger.kernel.org
Subject: Re: [ceph-users] cephfs snap mkdir strange timestamp
On Tue, Mar 10, 2020 at 01:39:29PM -0400, Jeff Layton wrote:
...
> > Signed-off-by: Luis Henriques <lhenriques(a)suse.com>
> > ---
> > fs/ceph/inode.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index
> > d01710a16a4a..f4e78ade0871 100644
> > --- a/fs/ceph/inode.c
> > +++ b/fs/ceph/inode.c
> > @@ -82,6 +82,8 @@ struct inode *ceph_get_snapdir(struct inode
*parent)
> > inode->i_mode = parent->i_mode;
> > inode->i_uid = parent->i_uid;
> > inode->i_gid = parent->i_gid;
> > + inode->i_mtime = parent->i_mtime;
> > + inode->i_ctime = parent->i_ctime;
> > inode->i_op = &ceph_snapdir_iops;
> > inode->i_fop = &ceph_snapdir_fops;
> > ci->i_snap_caps = CEPH_CAP_PIN; /* so we can open */
>
> What about the atime, and the ci->i_btime ?
Yeah, those probably make sense too, although the FUSE client doesn't seem
to touch atime (it does change btime; I missed that). I'll send a v2 in a
bit.
Cheers,
--
Luis
Hi,
I'm sorry to bother you again.
I want to use Kafka to queue the notifications. I added a topic named kafka and put the notification config XML.
The topic info:
<ListTopicsResponse xmlns="https://sns.amazonaws.com/doc/2010-03-31/">
<ListTopicsResult>
<Topics>
<member>
<User>sr</User>
<Name>kafka</Name>
<EndPoint>
<EndpointAddress>kafka://192.168.3.250:9092</EndpointAddress>
<EndpointArgs>kafka-ack-level=broker&push-endpoint=kafka://192.168.3.250:9092</EndpointArgs>
<EndpointTopic>kafka</EndpointTopic>
</EndPoint>
<TopicArn>arn:aws:sns:default::kafka</TopicArn>
</member>
<member>
<User>sr</User>
<Name>kafka_kafka</Name>
<EndPoint>
<EndpointAddress>kafka://192.168.3.250:9092</EndpointAddress>
<EndpointArgs>kafka-ack-level=broker&push-endpoint=kafka://192.168.3.250:9092</EndpointArgs>
<EndpointTopic>kafka</EndpointTopic>
</EndPoint>
<TopicArn>arn:aws:sns:default::kafka</TopicArn>
</member>
<member>
<User>sr</User>
<Name>webno</Name>
<EndPoint>
<EndpointAddress>http://192.168.1.114:8080/s3/sn</EndpointAddress>
<EndpointArgs>push-endpoint=http://192.168.1.114:8080/s3/sn</EndpointArgs>
<EndpointTopic>webno</EndpointTopic>
</EndPoint>
<TopicArn>arn:aws:sns:default::webno</TopicArn>
</member>
</Topics>
</ListTopicsResult>
<ResponseMetadata>
<RequestId>c4b84c5b-1e88-4f16-9863-7f68872d91a4.744394.135</RequestId>
</ResponseMetadata>
</ListTopicsResponse>
and the PUT notification body:
<NotificationConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <TopicConfiguration>
    <Id>kafka</Id>
    <Topic>arn:aws:sns:default::kafka</Topic>
  </TopicConfiguration>
</NotificationConfiguration>
The web notification works fine, but when I use Kafka (version 1.0, JDK 1.8)
I get the following debug info:
2020-03-11 12:46:38.612 7fd81eeb1700 20 get_system_obj_state: s->obj_tag was set empty
2020-03-11 12:46:38.612 7fd81eeb1700 10 cache get: name=default.rgw.log++pubsub.user.sr.bucket.osstest/c4b84c5b-1e88-4f16-9863-7f68872d91a4.14175.1 : hit (requested=0x1, cached=0x17)
2020-03-11 12:46:38.612 7fd81eeb1700 20 notification: 'kafka' on topic: 'kafka' and bucket: 'osstest' (unique topic: 'kafka_kafka') apply to event of type: 's3:ObjectCreated:Put'
2020-03-11 12:46:38.612 7fd81eeb1700 1 ERROR: failed to create push endpoint: kafka://192.168.3.250:9092 due to: pubsub endpoint configuration error: unknown schema in: kafka://192.168.3.250:9092
2020-03-11 12:46:38.612 7fd81eeb1700 5 req 126 0.186s s3:put_obj WARNING: publishing notification failed, with error: -22
2020-03-11 12:46:38.612 7fd81eeb1700 2
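In case it helps, this is roughly how I created the topic (a sketch; the RGW endpoint URL and credentials profile are placeholders, and I am not certain the attribute names are exactly right for this version):

```shell
# Create the SNS-style topic on RGW with the Kafka push endpoint passed
# as topic attributes (endpoint URL is a placeholder).
aws --endpoint-url http://192.168.1.114:8000 sns create-topic \
    --name kafka \
    --attributes push-endpoint=kafka://192.168.3.250:9092,kafka-ack-level=broker
```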
Hello,
is there any way to reset the deep-scrubbed time for PGs?
The cluster was accidentally left in state nodeep-scrub and is now unable
to deep scrub fast enough.
Is there any way to force-mark all PGs as deep-scrubbed, to start from 0
again?
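What I had in mind is something like this (untested sketch; assumes `jq` is available and that `ceph pg ls -f json` wraps the list in a `pg_stats` array, which would need checking on this version):

```shell
# Instruct every PG to deep-scrub. In practice this should be spread out
# over time rather than fired all at once.
for pg in $(ceph pg ls -f json | jq -r '.pg_stats[].pgid'); do
    ceph pg deep-scrub "${pg}"
done
```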
Greets,
Stefan
Hello List,
when I initially enable journal/mirror on an image, it gets
bootstrapped to my site-b pretty quickly at 250 MB/s, which is about
the I/O write limit.
Once it's up to date, the replay is very slow, about 15 KB/s, and
entries_behind_master just keeps running away:
root@ceph01:~# rbd --cluster backup mirror pool status rbd-cluster6 --verbose
health: OK
images: 3 total
3 replaying
...
vm-112-disk-0:
global_id: 60a795c3-9f5d-4be3-b9bd-3df971e531fa
state: up+replaying
description: replaying, master_position=[object_number=623,
tag_tid=3, entry_tid=345567], mirror_position=[object_number=35,
tag_tid=3, entry_tid=18371], entries_behind_master=327196
last_update: 2020-03-10 11:36:44
...
Write traffic on the source is about 20-25 MB/s.
On the source I run 14.2.6 and on the destination 12.2.13.
Any idea why the replay is so slow?
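I was considering raising the journal fetch/payload limits, something like the following (an untested sketch; the option names and values would need verifying against 12.2/14.2):

```shell
# The defaults for these are small and are a known bottleneck for
# steady-state journal replay.
# On the 14.2 (source) cluster:
ceph config set client rbd_journal_max_payload_bytes 8388608
# The 12.2 (backup) cluster has no "ceph config set", so this would go in
# the [client] section of ceph.conf for the rbd-mirror daemon instead:
#   rbd_mirror_journal_max_fetch_bytes = 33554432
```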
Thanks,
Michael
There's an xattr for this: ceph.snap.btime IIRC
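e.g. (mount point and snapshot name are placeholders):

```shell
# Read the snapshot birth-time xattr on a CephFS snapshot directory.
getfattr -n ceph.snap.btime /mnt/cephfs/dir/.snap/snap-9
```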
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Tue, Mar 10, 2020 at 11:42 AM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
>
>
> If I make a directory in Linux, the directory has the current date. Why is
> this not the case when creating a snap dir? Is this not a bug? One expects
> this to behave the same as in Linux, no?
>
> [ @ test]$ mkdir temp
>
> [ @os0 test]$ ls -arltn
> total 28
> drwxrwxrwt. 27 0 0 20480 Mar 10 11:38 ..
> drwxrwxr-x 2 801 801 4096 Mar 10 11:38 temp
> drwxrwxr-x 3 801 801 4096 Mar 10 11:38 .
>
>
> [ @ .snap]# mkdir test
> [ @ .snap]# ls -lartn
> total 0
> drwxr-xr-x 861886554 0 0 8390344070786420358 Jan 1 1970 .
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 test
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-9
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-8
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-7
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-6
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-5
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 snap-10
> drwxr-xr-x 4 0 0 2 Mar 6 14:43 ..
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
For testing purposes I swapped the 3.10 kernel for a 5.5, and now I am
getting these messages. I assume 3.10 just never displayed them. Could
this be a problem with the caps of my fs client user?
[Mon Mar 9 23:10:52 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar 9 23:12:03 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar 9 23:13:12 2020] ceph: Can't lookup inode 1 (err: -13)
[Mon Mar 9 23:14:19 2020] ceph: Can't lookup inode 1 (err: -13)
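In case it's relevant, this is roughly how I would inspect and regrant the caps (client name, path, and pool are placeholders; I am not sure this is the right fix — err -13 is EACCES, and newer kernels seem to stat the root inode even for path-restricted clients):

```shell
# Show the client's current caps.
ceph auth get client.fsuser
# Regrant with read on the filesystem root plus rw on the working subtree.
ceph auth caps client.fsuser \
    mds 'allow r path=/, allow rw path=/somedir' \
    mon 'allow r' \
    osd 'allow rw pool=cephfs_data'
```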