Hello again,
I have a new question:
We want to upgrade a server whose OS is based on RHEL 6.
The Ceph cluster is currently on Octopus.
How can I install the client packages to mount CephFS and take a backup of the server?
Is it even possible?
Are the client packages from Hammer compatible with the Octopus release?
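For context, this is roughly what we would be aiming for, assuming a kernel client or ceph-fuse can be installed on RHEL 6 at all (monitor address, user name and key are placeholders):
mount -t ceph 192.0.2.10:6789:/ /mnt/cephfs -o name=backup,secret=<key>
# or, if the FUSE client is easier to get onto RHEL 6:
ceph-fuse --id backup /mnt/cephfs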
Thanks in advance,
Simon
Hello everyone,
I've got a fresh Ceph Octopus installation and I'm trying to set up a CephFS with an erasure-coded data pool.
The metadata pool was created with default settings.
The erasure code pool was set up with this command:
-> ceph osd pool create ec-data_fs 128 erasure default
Enabled overwrites:
-> ceph osd pool set ec-data_fs allow_ec_overwrites true
And created the fs:
-> ceph fs new ec-data_fs meta_fs ec-data_fs --force
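For completeness, the resulting pool flags and the fs layout can be double-checked with standard queries like:
-> ceph osd pool ls detail
-> ceph fs ls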
Then I tried deploying the mds, but this fails:
-> ceph orch daemon add mds ec-data_fs magma01
returns:
-> Deployed mds.ec-data_fs.magma01.ujpcly on host 'magma01'
The mds daemon is not there.
Apparently the container dies without any information, as seen in the journal:
May 25 16:11:56 magma01 podman[9348]: 2020-05-25 16:11:56.670510456 +0200 CEST m=+0.186462913 container create 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
May 25 16:11:56 magma01 systemd[1]: Started libpod-conmon-0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90.scope.
May 25 16:11:56 magma01 systemd[1]: Started libcontainer container 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90.
May 25 16:11:57 magma01 podman[9348]: 2020-05-25 16:11:57.112182262 +0200 CEST m=+0.628134873 container init 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
May 25 16:11:57 magma01 podman[9348]: 2020-05-25 16:11:57.137011897 +0200 CEST m=+0.652964354 container start 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
May 25 16:11:57 magma01 podman[9348]: 2020-05-25 16:11:57.137110412 +0200 CEST m=+0.653062853 container attach 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
May 25 16:11:57 magma01 systemd[1]: libpod-0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90.scope: Consumed 327ms CPU time
May 25 16:11:57 magma01 podman[9348]: 2020-05-25 16:11:57.182968802 +0200 CEST m=+0.698921275 container died 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
May 25 16:11:57 magma01 podman[9348]: 2020-05-25 16:11:57.413743787 +0200 CEST m=+0.929696266 container remove 0fdf8c508b330adac713ffb04c72b5df770277ad191d844888f7387f28e3cc90 (image=docker.io/ceph/ceph:v15, name=competent_cori)
Can someone help me debug this?
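In case it helps anyone reproduce this, I assume more detail should be available from the cephadm log channel and the per-daemon logs (the daemon name is taken from the deploy message above):
-> ceph log last cephadm
-> cephadm logs --name mds.ec-data_fs.magma01.ujpcly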
Cheers
Simon
Hi,
Following on from various woes, we are currently seeing odd and unhelpful behaviour with some OSDs on our cluster.
A minority of OSDs seem to have runaway memory usage, rising to tens of GB, whilst other OSDs on the same host behave sensibly. This started when we moved from Mimic -> Nautilus, as far as we can tell.
In the best case, this causes some nodes to start swapping [and reduces their performance]. In the worst case, it triggers the OOM killer.
I have dumped the mempool for these OSDs, which shows that almost all the memory is in the buffer_anon pool.
The perf dump shows that the OSD is targeting the 4 GB limit that's set for it, but for some reason is unable to get down to it due to stuff tracked by the priority cache (which seems to be mostly what is filling buffer_anon).
Can anyone advise on what we should do next?
(mempool dump and excerpt of perf dump at end of email).
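For anyone wanting to reproduce the numbers, the dumps below are the standard admin-socket dumps, i.e. (osd id is a placeholder):
ceph daemon osd.<id> dump_mempools
ceph daemon osd.<id> perf dump
ceph config get osd.<id> osd_memory_target   # the 4 GB target mentioned above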
Thanks for any help,
Sam Skipsey
MEMPOOL DUMP
{
"mempool": {
"by_pool": {
"bloom_filter": {
"items": 0,
"bytes": 0
},
"bluestore_alloc": {
"items": 5629372,
"bytes": 45034976
},
"bluestore_cache_data": {
"items": 127,
"bytes": 65675264
},
"bluestore_cache_onode": {
"items": 8275,
"bytes": 4634000
},
"bluestore_cache_other": {
"items": 2967913,
"bytes": 62469216
},
"bluestore_fsck": {
"items": 0,
"bytes": 0
},
"bluestore_txc": {
"items": 145,
"bytes": 100920
},
"bluestore_writing_deferred": {
"items": 335,
"bytes": 13160884
},
"bluestore_writing": {
"items": 1406,
"bytes": 5379120
},
"bluefs": {
"items": 1105,
"bytes": 24376
},
"buffer_anon": {
"items": 13705143,
"bytes": 40719040439
},
"buffer_meta": {
"items": 6820143,
"bytes": 600172584
},
"osd": {
"items": 96,
"bytes": 1138176
},
"osd_mapbl": {
"items": 59,
"bytes": 7022524
},
"osd_pglog": {
"items": 491049,
"bytes": 156701043
},
"osdmap": {
"items": 107885,
"bytes": 1723616
},
"osdmap_mapping": {
"items": 0,
"bytes": 0
},
"pgmap": {
"items": 0,
"bytes": 0
},
"mds_co": {
"items": 0,
"bytes": 0
},
"unittest_1": {
"items": 0,
"bytes": 0
},
"unittest_2": {
"items": 0,
"bytes": 0
}
},
"total": {
"items": 29733053,
"bytes": 41682277138
}
}
}
PERF DUMP excerpt:
"prioritycache": {
"target_bytes": 4294967296,
"mapped_bytes": 38466584576,
"unmapped_bytes": 425984,
"heap_bytes": 38467010560,
"cache_bytes": 134217728
},
Folks,
I am running into a very strange issue with a brand new Ceph cluster during initial testing. The cluster
consists of 12 nodes; 4 of them have SSDs only, the other eight have a mixture of SSDs and HDDs.
The latter nodes are configured so that three or four HDDs use one SSD for their block.db.
The Ceph version is Nautilus.
When writing to the cluster, clients will, at regular intervals, run into I/O stalls (i.e. writes will take up
to 25 minutes to complete). Deleting RBD images will often take forever as well. After several weeks
of debugging, what I can say from looking at the log files is that what appears to take a lot of time
is writing to the OSDs:
"time": "2020-05-20 10:52:23.211006",
"event": "reached_pg"
},
{
"time": "2020-05-20 10:52:23.211047",
"event": "waiting for ondisk"
},
{
"time": "2020-05-20 10:53:35.369081",
"event": "done"
}
But these machines are idle in terms of I/O; there is almost no I/O happening at all according to sysstat.
I am slowly growing a bit desperate over this, and hence I wonder whether anybody has ever
seen a similar issue? Or are there possibly any tips on where to carry on with debugging?
Servers are from Dell with PERC controllers in HBA mode.
The primary purpose of this Ceph cluster is to serve as backing storage for OpenStack, and so far
I have not been able to reproduce the issue with the SSD-only nodes.
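For reference, the event trace above is the kind of output the OSD op tracker provides; the following admin-socket commands (osd id is a placeholder) show current and recent slow ops:
ceph daemon osd.<id> dump_ops_in_flight
ceph daemon osd.<id> dump_historic_slow_ops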
Best regards
Martin
Hi,
I am trying to set up a multisite cluster with 2 sites. I created the master zonegroup and zone by following the instructions given in the documentation. On the secondary zone cluster I was able to pull from the master zone. I created the secondary zone. When I tried to commit the period, I got the following error:
2020-05-25 16:16:46.054 7f4ad25596c0 1 Cannot find zone id=2f272093-3712-45a7-8a63-b17f12ccd07c (name=testsite2), switching to local zonegroup configuration
Sending period to new master zone 6d8d5ffa-2034-4717-978e-3ab4ba4349c5
request failed: (5) Input/output error
failed to commit period: (5) Input/output error
Can someone please help me solve this issue?
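In case it helps to diagnose this, the failing step is the standard period commit on the secondary zone; a hedged suggestion is to re-run it with verbose logging to see which request returns the EIO:
radosgw-admin period update --commit --debug-rgw=20 --debug-ms=1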
Regards,
Sailaja
Hi,
The object counts from "rados df" and "rados ls" are different
in my testing environment. I think it may be some zero-byte or unclean
objects, since I removed all RBD images on top of this pool a few days ago.
How can I fix this / find out where those ghost objects are? Or should
I ignore it, since the numbers are not that high?
$ rados -p rbd df
POOL_NAME  USED    OBJECTS  CLONES  COPIES   MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS     RD       WR_OPS    WR      USED COMPR  UNDER COMPR
rbd        18 MiB  430107   0       1290321  0                   0        0         141243877  6.9 TiB  42395431  11 TiB  0 B         0 B
$ rados -p rbd ls | wc -l
4
$ rados -p rbd ls
gateway.conf
rbd_directory
rbd_info
rbd_trash
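If it matters, the listing above only covers the default namespace; a hedged guess is that listing all namespaces might account for part of the difference:
$ rados -p rbd ls --all | wc -l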
Regs,
Icy
Hi, I am new to RGW and am trying to deploy a multisite cluster in order to sync
data from one cluster to another.
My source zone is the default zone in the default zonegroup, with the structure as
below:
            realm: big-realm
                   |
          zonegroup: default
           /               \
master zone: default    secondary zone: backup
*STEPS*:
on source cluster:
1. radosgw-admin realm create --rgw-realm=big-realm --default
2. radosgw-admin zonegroup modify --rgw-realm big-realm --rgw-zonegroup default --master --endpoints "http://172.24.29.26:7480"
3. radosgw-admin zone modify --rgw-zonegroup default --rgw-zone default --master --endpoints "http://172.24.29.26:7480"
4. radosgw-admin user create --uid=sync-user --display-name="Synchronization User" --access-key=redhat --secret=redhat --system
5. radosgw-admin zone modify --rgw-zone=default --access-key=redhat --secret=redhat
6. radosgw-admin period update --commit
on destination cluster:
1. radosgw-admin realm pull --url="http://172.24.29.26:7480" --access-key=redhat --secret=redhat --rgw-realm=big-realm
2. radosgw-admin realm default --rgw-realm=big-realm
3. radosgw-admin period pull --url="http://172.24.29.26:7480" --access-key=redhat --secret=redhat
4. radosgw-admin zonegroup default --rgw-zonegroup=default
5. radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=backup --endpoints="http://172.24.29.29:7480" --access-key=redhat --secret=redhat --default
6. radosgw-admin period update --commit
Committing the period on the secondary zone returns an error:
2020-04-02 14:36:04.707 7fd8ee9376c0 1 Cannot find zone
id=8c75360a-c0cf-4772-b85e-ff74726396c2 (name=backup), switching to local
zonegroup configuration
Sending period to new master zone 5fba7cae-47f1-4c8e-9a34-1b499c9c27f8
request failed: (2202) Unknown error 2202
failed to commit period: (2202) Unknown error 2202
radosgw-admin sync status:
2020-04-02 14:37:18.330 7f27c60676c0 1 Cannot find zone
id=8c75360a-c0cf-4772-b85e-ff74726396c2 (name=backup), switching to local
zonegroup configuration
realm fec73799-36be-4418-abb2-9804cc83d83d (big-realm)
zonegroup fc61ac2f-dc1d-421b-90af-ffe9113b9935 (default)
zone 8c75360a-c0cf-4772-b85e-ff74726396c2 (backup)
metadata sync failed to read sync status: (2) No such file or directory
data sync source: 5fba7cae-47f1-4c8e-9a34-1b499c9c27f8 (default)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
My source cluster's version is 13.2.8, and the destination cluster's is 14.2.8.
I also tried syncing with both clusters on 13.2.8 and got the same error.
Is there any step I got wrong, or is it that the default zone cannot be synced?
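For what it's worth, these are the objects I would compare on both clusters to check whether the backup zone actually made it into the zonegroup before the commit (all standard radosgw-admin queries):
radosgw-admin period get
radosgw-admin zonegroup get --rgw-zonegroup=default
radosgw-admin zone get --rgw-zone=backup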
Thanks
Hi,
I am using a Ceph Nautilus cluster with the configuration below.
3 nodes (Ubuntu 18.04), each with 12 OSDs, and the MDS, MON and MGR daemons are running
in shared (co-located) mode.
The client is mounted through the Ceph kernel client.
I was trying to emulate a node failure while reads and writes were going on against a
replica-2 pool.
I was expecting reads and writes to continue after a small pause due to the node failure,
but I/O halts and never resumes until the failed node is back up.
I remember testing the same scenario before on Ceph Mimic, where IO continued after a
small pause.
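For completeness, the pool settings that govern whether IO continues with a replica down can be checked with (pool name is a placeholder):
ceph osd pool get <pool-name> size
ceph osd pool get <pool-name> min_size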
regards
Amudhan P