Hi,
I am using Ceph Mimic in a small test setup with the configuration below.
OS: Ubuntu 18.04
1 node running mon, mds, and mgr: 4-core CPU, 4 GB RAM, 1 Gb LAN
3 nodes, each with 2 OSDs on 2 TB disks: 2-core CPU, 4 GB RAM, 1 Gb LAN
1 node acting as CephFS client: 2-core CPU, 4 GB RAM, 1 Gb LAN
Configured cephfs_metadata_pool (3 replicas) and cephfs_data_pool as
erasure-coded 2+1.
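(For reference, the pools would have been created roughly along these lines; a reconstructed sketch using the names and PG counts above, so the exact commands may have differed:)
# 2+1 erasure profile; with 3 OSD hosts the failure domain can stay "host"
ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
ceph osd pool create cephfs_metadata_pool 8 replicated
ceph osd pool create cephfs_data_pool 16 erasure ec21
# CephFS needs overwrites enabled on an erasure-coded data pool
ceph osd pool set cephfs_data_pool allow_ec_overwrites true
# --force is required when the default data pool is erasure-coded
ceph fs new cephfs cephfs_metadata_pool cephfs_data_pool --force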
While running a script that creates many folders, Ceph started reporting
slow/late IO due to the high metadata workload.
Once folder creation completed, PGs became degraded. I am waiting for the
PGs to finish recovery, but my OSDs keep crashing due to OOM and
restarting after some time.
Now my question is: I can wait for recovery to complete, but how do I stop
the OOM kills and OSD crashes? Basically, I want to know how to control
memory usage during recovery and keep the cluster stable.
I have also set very low PG counts: 8 for the metadata pool and 16 for the
data pool.
I have already set osd_memory_target to 1 GB, and I have raised
osd_max_backfills from 1 to 8.
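(For reference, these are the knobs involved; a sketch assuming BlueStore OSDs and Mimic's centralized "ceph config" interface. Note that 1 GiB is below the 4 GiB default for osd_memory_target and roughly the practical floor:)
# cap each OSD's memory target (value in bytes; 1073741824 = 1 GiB)
ceph config set osd osd_memory_target 1073741824
# lower values here reduce peak memory use during recovery/backfill
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
# push the new values to running daemons without a restart
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'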
Attached is a message from "kern.log" on one of the nodes; a snippet of
the error is included in this mail.
---------error msg snippet ----------
-bash: fork: Cannot allocate memory
Sep 18 19:01:57 test-node1 kernel: [341246.765644] msgr-worker-0 invoked
oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null),
order=0, oom_score_adj=0
Sep 18 19:02:00 test-node1 kernel: [341246.765645] msgr-worker-0 cpuset=/
mems_allowed=0
Sep 18 19:02:00 test-node1 kernel: [341246.765650] CPU: 1 PID: 1737 Comm:
msgr-worker-0 Not tainted 4.15.0-45-generic #48-Ubuntu
Sep 18 19:02:02 test-node1 kernel: [341246.765833] Out of memory: Kill
process 1727 (ceph-osd) score 489 or sacrifice child
Sep 18 19:02:03 test-node1 kernel: [341246.765919] Killed process 1727
(ceph-osd) total-vm:3483844kB, anon-rss:1992708kB, file-rss:0kB,
shmem-rss:0kB
Sep 18 19:02:03 test-node1 kernel: [341246.899395] oom_reaper: reaped
process 1727 (ceph-osd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Sep 18 22:09:57 test-node1 kernel: [352529.433155] perf: interrupt took too
long (4965 > 4938), lowering kernel.perf_event_max_sample_rate to 40250
regards
Amudhan
https://www.thegeekdiary.com/centos-rhel-67-why-the-files-in-tmp-directory-…
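(On systemd-based distros the periodic /tmp cleanup is usually systemd-tmpfiles rather than a cron job; a quick way to check, assuming a stock layout:)
# the ageing rule for /tmp, if any, lives here
cat /usr/lib/tmpfiles.d/tmp.conf
# and the timer that drives the cleanup
systemctl list-timers systemd-tmpfiles-clean.timer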
On 24.09.19 at 14:53, Lenz Grimmer wrote:
> On 9/24/19 1:37 PM, Miha Verlic wrote:
>
>> I've got slightly different problem. After a few days of running fine,
>> dashboard stops working because it is apparently seeking for wrong
>> certificate file in /tmp. If I restart ceph-mgr it starts to work again.
> Does the restart trigger the creation of a similar-looking file in /tmp?
> I wonder if there's some kind of cron job that cleans up the /tmp
> directory every now and then...
>
> Lenz
>
>
--
Volker Theile
Software Engineer | Ceph | openATTIC
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
GF: Felix Imendörffer, HRB 247165 (AG München)
Phone: +49 173 5876879
E-Mail: vtheile(a)suse.com
Hi everyone,
I'm configuring the iSCSI gateway in Ceph Mimic (13.2.6) using the Ceph
manual:
https://docs.ceph.com/docs/mimic/rbd/iscsi-target-cli/
But I'm stuck on this problem. The manual says:
"Set the client’s CHAP username to myiscsiusername and password to
myiscsipassword:
> /iscsi-target...at:rh7-client> auth chap=myiscsiusername/myiscsipassword"
But I receive this response:
/iscsi-target...at:rh7-client> auth chap=myiscsitest/myiscsitestpasswd
Unexpected keyword parameter 'chap'.
The available options are:
/iscsi-target...at:rh7-client> auth ?
To set authentication, specify username=<user> password=<password>
[mutual_username]=<user> [mutual_password]=<password>
But if I configure it as the help text asks:
auth username=myiscsitest password=myiscsitestpasswd
Failed to update the client's auth: Invalid password
I tried with higher password complexity, but the problem persists.
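(If I remember the ceph-iscsi validation rules of that era correctly, and this is an assumption, not something the Mimic docs state, the CHAP password must be 12-16 characters drawn from letters, digits, and @ - _ /, so "myiscsitestpasswd" would fail simply for being 17 characters long. A sketch with made-up credentials of a valid length:)
/iscsi-target...at:rh7-client> auth username=myiscsiuser password=myiscsipw123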
My questions:
- What is the correct way to configure authentication?
- How can I contribute an update to the documentation? A bug report was
opened* about broken installation instructions for ceph-iscsi-gw, but it
was closed without the documentation being updated:
https://github.com/ceph/ceph-ansible/issues/2707
Regards
Gesiel Bernardeds
Hi ceph experts,
I deployed Nautilus (v14.2.4) and Luminous (v12.2.11) on the same hardware and made a rough performance comparison. The result suggests Luminous is much better, which is unexpected.
My setup:
3 servers, each with 3 HDD OSDs and 1 SSD as the DB device, and two separate 1G networks for cluster and public traffic.
The pool "test" has pg_num and pgp_num of 32, with replicated size 3.
Using "rados -p test bench 80 write" to measure write performance.
The result:
Luminous: Average IOPS 36
Nautilus: Average IOPS 28
Is this difference expected for Nautilus?
Br,
Xu Yun
As the signature shows, please send an email to ceph-users-leave(a)ceph.io
to unsubscribe.
hou guanghua wrote:
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi Alberto,
Did you try the "--addv" option? Here's an example:
monmaptool --addv <mon.id>
[v2:<ip_address>:<v2_port>,v1:<ip_address>:<v1_port>] <monmap_file>
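A fuller sketch, with a placeholder address and output file (monmaptool expects the monmap file as its final argument):
monmaptool --create --addv a [v2:192.168.0.10:3300,v1:192.168.0.10:6789] /tmp/monmap
monmaptool --print /tmp/monmap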
Cheers,
Ricardo Dias
On 23/09/19 16:07, Corona, Alberto wrote:
> Hi folks,
>
> While practicing some disaster recovery I noticed that it currently seems impossible to add both v1 and v2 addresses for a monitor to a monmap using monmaptool. Is there any way to manually create a monmap that includes both protocol versions? Currently on Ceph version 14.2.4
>
>
> - Alberto Corona
>
>
We are planning our ceph architecture and I have a question:
How should NVMe drives be used when our spinning storage devices use
BlueStore?
1. BlueStore WAL and DB partitions
(https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-re…)
2. Cache tier
(https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/)
3. Something else?
Hardware: Each node has:
3x 8 TB HDD
1x 450 GB NVMe drive
192 GB RAM
2x Xeon CPUs (24 cores total)
I plan to have three OSD daemons running on each node. There are 95 nodes
total with the same hardware.
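If we went with option 1, I assume the setup would look something like this (a sketch; device names are placeholders, and splitting the 450 GB NVMe three ways gives ~150 GB of DB per OSD, with the WAL living inside the DB partition when not given separately):
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/nvme0n1p2
ceph-volume lvm create --bluestore --data /dev/sdd --block.db /dev/nvme0n1p3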
Use Case:
The plan is to create a CephFS filesystem and use it to store people's
home directories and data. I anticipate more read operations than writes.
Regarding cache tiering: the online documentation says cache tiering
will often degrade performance, but when I read various threads on this
ML there do seem to be people using cache tiering with success. I do see
that it is heavily dependent upon one's use case. In 2019, are there any
updated recommendations as to whether to use cache tiering?
If there is a third suggestion that people have I would be interested in
hearing it. Thanks in advance.
Sincerely,
Shawn Kwang
On Mon, 23 Sep 2019, Koebbe, Brian wrote:
> Thanks Sage!
> ceph osd dump: https://pastebin.com/raw/zLPz9DQg
>
>
> ceph-monstore-tool /var/lib/ceph/mon/ceph-ufm03 dump-keys |grep osd_snap| cut -c-29 |uniq -c
> 2 osd_snap / purged_snap_10_000
> 1 osd_snap / purged_snap_12_000
> 75 osd_snap / purged_snap_13_000
> 4 osd_snap / purged_snap_14_000
> 778106 osd_snap / purged_snap_15_000
> 861 osd_snap / purged_snap_1_0000
> 323 osd_snap / purged_snap_4_0000
> 88 osd_snap / purged_snap_5_0000
> 2 osd_snap / purged_snap_7_0000
> 2 osd_snap / purged_snap_8_0000
> 2 osd_snap / removed_epoch_10_0
> 1 osd_snap / removed_epoch_12_0
> 75 osd_snap / removed_epoch_13_0
> 4 osd_snap / removed_epoch_14_0
> 2316417 osd_snap / removed_epoch_15_0
> 970 osd_snap / removed_epoch_1_00
> 324 osd_snap / removed_epoch_4_00
> 89 osd_snap / removed_epoch_5_00
> 2 osd_snap / removed_epoch_7_00
> 2 osd_snap / removed_epoch_8_00
> 2 osd_snap / removed_snap_10_00
> 1 osd_snap / removed_snap_12_00
> 75 osd_snap / removed_snap_13_00
> 4 osd_snap / removed_snap_14_00
> 2720849 osd_snap / removed_snap_15_00
> 1161 osd_snap / removed_snap_1_000
> 379 osd_snap / removed_snap_4_000
> 89 osd_snap / removed_snap_5_000
> 2 osd_snap / removed_snap_7_000
> 2 osd_snap / removed_snap_8_000
Thanks! I opened a ticket at https://tracker.ceph.com/issues/42012. Can
you do the above dump with the 'grep osd_snap' only and attach that to the
bug?
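Something like this should do it (same mon path as your earlier command):
ceph-monstore-tool /var/lib/ceph/mon/ceph-ufm03 dump-keys | grep osd_snap > osd_snap_keys.txt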
This code has been reworked/improved in master, but I'm not sure the
behavior of keeping a full record of past snap deletions has changed, so
we may need to make further improvements for Octopus.
Thanks!
sage
> ________________________________
> From: Sage Weil <sage(a)newdream.net>
> Sent: Monday, September 23, 2019 9:41 AM
> To: Koebbe, Brian <koebbe(a)wustl.edu>
> Cc: ceph-users(a)ceph.io <ceph-users(a)ceph.io>; dev(a)ceph.io <dev(a)ceph.io>
> Subject: Re: [ceph-users] Seemingly unbounded osd_snap keys in monstore. Normal? Expected?
>
> Hi,
>
> On Mon, 23 Sep 2019, Koebbe, Brian wrote:
> > Our cluster has a little over 100 RBDs. Each RBD is snapshotted with a typical "frequently", hourly, daily, monthly type of schedule.
> > A while back a 4th monitor was temporarily added to the cluster that took hours to synchronize with the other 3.
> > While trying to figure out why that addition took so long, we discovered that our monitors have what seems like a really large number of osd_snap keys:
> >
> > ceph-monstore-tool /var/lib/ceph/mon/xxxxxx dump-keys |awk '{print $1}'|uniq -c
> > 153 auth
> > 2 config
> > 10 health
> > 1441 logm
> > 3 mdsmap
> > 313 mgr
> > 1 mgr_command_descs
> > 3 mgr_metadata
> > 163 mgrstat
> > 1 mkfs
> > 323 mon_config_key
> > 1 mon_sync
> > 6 monitor
> > 1 monitor_store
> > 32 monmap
> > 120 osd_metadata
> > 1 osd_pg_creating
> > 5818618 osd_snap
> > 41338 osdmap
> > 754 paxos
> >
> > A few questions:
> >
> > Could this be the cause of the slow addition/synchronization?
>
> Probably!
>
> > Is what looks like an unbounded number of osd_snaps expected?
>
> Maybe. Can you send me the output of 'ceph osd dump'? Also, if you don't
> mind doing the dump above and grepping out just the osd_snap keys, so I
> can see what they look like and if they match the osd map contents?
>
> Thanks!
> sage
>
>
> > If trimming/compacting them would help, how would one do that?
> >
> > Thanks,
> > Brian
> >