Hello,
After upgrading our Ceph cluster to Octopus a few days ago, we are seeing VMs
crash with the error below. We are using Ceph with OpenStack (Rocky).
Everything runs Ubuntu 18.04 with kernel 5.3. We see these crashes on
busy VMs. The cluster was upgraded from Nautilus.
kernel: [430751.176904] fn-radosclient[3905]: segfault at da0801 ip 00007fe78e076686 sp 00007fe7697f9470 error 4 in librbd.so.1.12.0[7fe78de73000+5cb000]
Apr 6 03:26:00 compute6 kernel: [430751.176922] Code: 00 64 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 48 85 db 0f 84 fa 00 00 00 80 bf 38 01 00 00 00 48 89 fd 0f 84 ea 00 00 00 <83> bb 20 3f 00 00 ff 0f 84 dd 00 00 00 48 8b 83 18 3f 00 00 48 8d
Apr 6 03:26:11 compute6 libvirtd[1671]: 2020-04-06 03:26:11.955+0000: 1671: error : qemuMonitorIO:719 : internal error: End of file from qemu monitor
Hi there,
I have a fairly simple Ceph multisite configuration with two Ceph clusters
in two different datacenters in the same city.
The rgws have this config for ssl:
rgw_frontends = civetweb port=7480+443s
ssl_certificate=/opt/ssl/ceph-bundle.pem
The certificate is a real issued certificate, not self-signed.
I configured the multisite with the guide from
https://docs.ceph.com/docs/nautilus/radosgw/multisite/
More or less ok so far, some learning curve but that's ok
I can access and upload to buckets at both endpoints with an S3 client
using HTTPS - https://ceph01cs1.domain.com and
https://ceph01cs2.domain.com - all good.
Now the problem seems to be when my zones in the zonegroup use https
endpoints, e.g.
{
    "id": "4c6774fb-01eb-41fe-a74a-c2693f8e69fc",
    "name": "eu",
    "api_name": "eu",
    "is_master": "true",
    "endpoints": [
        "https://ceph01cs1.domain.com:443"
    ],
    "hostnames": [],
    "hostnames_s3website": [],
    "master_zone": "0c203df2-6f31-4ad1-a899-91f85bf34c4e",
    "zones": [
        {
            "id": "0c203df2-6f31-4ad1-a899-91f85bf34c4e",
            "name": "ceph01cs1",
            "endpoints": [
                "https://ceph01cs1.domain.com:443"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        },
        {
            "id": "fec1fec8-a3c1-454d-8ed2-2c1da45f9c33",
            "name": "ceph01cs2",
            "endpoints": [
                "https://ceph01cs2.domain.com:443"
            ],
            "log_meta": "false",
            "log_data": "true",
            "bucket_index_max_shards": 0,
            "read_only": "false",
            "tier_type": "",
            "sync_from_all": "true",
            "sync_from": [],
            "redirect_zone": ""
        }
    ],
    "placement_targets": [
        {
            "name": "default-placement",
            "tags": [],
            "storage_classes": [
                "STANDARD"
            ]
        }
    ],
    "default_placement": "default-placement",
    "realm_id": "08921dd5-1523-41b6-908f-2f58aa38c969"
}
Metadata syncs OK - buckets and users get created, but data doesn't - and
the period can be committed and appears on both clusters.
I can also curl between the two clusters over 443.
However, data sync gets stuck on 'init':
          realm 08921dd5-1523-41b6-908f-2f58aa38c969 (world)
      zonegroup 4c6774fb-01eb-41fe-a74a-c2693f8e69fc (eu)
           zone 0c203df2-6f31-4ad1-a899-91f85bf34c4e (ceph01cs2)
  metadata sync no sync (zone is master)
      data sync source: fec1fec8-a3c1-454d-8ed2-2c1da45f9c33 (ceph01cs1)
                        init
                        full sync: 128/128 shards
                        full sync: 0 buckets to sync
                        incremental sync: 0/128 shards
                        data is behind on 128 shards
                        behind shards:
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]
I find errors like:
2020-03-31 20:27:11.372 7f60c84e1700 0 RGW-SYNC:data:sync: ERROR: failed to init sync, retcode=-16
2020-03-31 20:27:29.548 7f60c84e1700 0 RGW-SYNC:data:sync:init_data_sync_status: ERROR: failed to read remote data log shards
2020-03-31 20:29:48.499 7f60c94e3700 0 RGW-SYNC:meta: ERROR: failed to fetch all metadata keys
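In case it helps, the commands I know of for digging further (standard
radosgw-admin commands, listed here only as a sketch; the source zone name
is from my setup above) are:
radosgw-admin sync error list
radosgw-admin data sync status --source-zone=ceph01cs1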
If I change the endpoints in the zonegroup to plain http, e.g.
http://ceph01cs1.domain.com:7480 and http://ceph01cs2.domain.com:7480
then sync starts!
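For reference, the way I switch the endpoints is the usual zonegroup
round-trip (sketched below; the file name is just an example):
radosgw-admin zonegroup get > zonegroup.json
(edit the "endpoints" entries in zonegroup.json, https <-> http)
radosgw-admin zonegroup set --infile zonegroup.json
radosgw-admin period update --commit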
So my question (I couldn't find any examples of people using HTTPS
endpoints for sync): are HTTPS endpoints supported with multisite? And why
would metadata sync work over HTTPS but not data sync?
Many thanks
Richard
Hi,
I want to try using object storage with Java.
Is it possible to set up OSDs with "only" directories as the data destination
(using cephadm), instead of whole disks? I have read through much of the
documentation but couldn't find how to do it (if it's possible at all).
Thanks
Michael
Hello,
I am seeing some commands running on CephFS mounts get stuck in uninterruptible sleep, at which point I can only terminate them by rebooting the client. Has anyone experienced anything similar and found a way to safeguard against this?
My mount is using the ceph kernel driver, with the following config in fstab: 10.225.44.236,10.225.44.237,10.225.44.238:6789:/albacore/system/deploy on /opt/dcl/deploy type ceph (rw,noatime,name=albacore,secret=<hidden>,acl,wsize=32768,rsize=32768,_netdev)
The vast majority of commands complete successfully on the mounted filesystem, but on one occasion a "chmod -R +r *" command hung indefinitely (despite having run successfully numerous times before). Attempts to terminate the process with `kill` fail. Repeated attempts to run the same command also get blocked in the same state. `ps` shows the processes are stuck in uninterruptible sleep:
[root@svr01 albacore] ~> ps -Al | grep chmod
4 D 0 18657 18656 0 80 0 - 26998 rwsem_ pts/2 00:00:00 chmod
4 D 0 21835 1 0 80 0 - 26998 rwsem_ ? 00:00:00 chmod
Ceph seems to be unaware of the hung processes. There are no slow ops / ops in flight, either in the dump_ops_in_flight output on the server or under /sys/kernel/debug/ceph/ on the client. Similarly, there are no logs in dmesg for the command / process. Ceph health reports no MDS issues, and there's nothing in my MDS logs from when the processes hung.
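The only other things I can think of checking are the kernel stack of a blocked task and the kernel client's in-flight requests, roughly like this (assuming root access and that debugfs is mounted; 18657 is one of the PIDs above):
cat /proc/18657/stack
cat /sys/kernel/debug/ceph/*/mdsc
cat /sys/kernel/debug/ceph/*/osdc
but as mentioned, the debugfs files show nothing outstanding.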
The only method I've found of clearing the processes is to reboot my client.
Has anyone got experience with this? Are there ceph mount options that would guard against this?
Some details of the current setup:
• ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
• We're using the ceph kernel driver, kernel: 5.5.7-1.el7.elrepo.x86_64
• The client server has 38 separate directories mounted, all from the same CephFS filesystem.
• All 38 directories are mounted with the same config by three separate clients.
• Mount config (in fstab): 10.225.44.236,10.225.44.237,10.225.44.238:6789:/albacore/system/deploy on /opt/dcl/deploy type ceph (rw,noatime,name=albacore,secret=<hidden>,acl,wsize=32768,rsize=32768,_netdev)
Kind regards,
Dave
Hi:
I use `rbd map` to map a Ceph block device on the local host, but I found that I cannot control the device name.
This is a problem: when the name changes, the filesystem on the device, or a database using it, may break.
Can I control the device name?
For example:
[root@gate2 ~]# rbd showmapped
id pool namespace image snap device
0 testpool test_img - /dev/rbd0
Can I map the device and directly give the device name, like this:
rbd map testpool/test_img /dev/rbd0
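(As far as I understand, the rbd udev rules also create a stable per-image
symlink, something like this for my image above:
/dev/rbd/testpool/test_img -> /dev/rbd0
which could be used instead of the raw name, but I would still like to know
whether the device name itself can be chosen.)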
sz_cuitao(a)163.com
Hi all,
I have a Ceph cluster with ~70 OSDs of different sizes running on Mimic.
I'm using ceph-deploy for managing the cluster.
I have to remove some smaller drives and replace them with bigger drives.
From your experience, are the "removing an OSD" guidelines in the Mimic docs
accurate? I know that there were some changes from older versions and I want
to avoid any confusion.
I'm talking about the following procedure (a rough sketch follows below):
- take the OSD out of the cluster with "ceph osd out <osd_no>"
- stop the OSD daemon if it's still running
- purge the OSD from the cluster map, running the following from the
ceph-deploy host (or one of the mons?):
ceph osd purge {id} --yes-i-really-mean-it
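As a rough end-to-end sketch of what I have in mind (assuming I don't need to
reuse the OSD IDs; <osd_no> is the numeric ID):
ceph osd out <osd_no>
(wait for rebalancing to finish, e.g. watch "ceph -s" and check
"ceph osd safe-to-destroy <osd_no>")
systemctl stop ceph-osd@<osd_no>    (on the OSD host)
ceph osd purge <osd_no> --yes-i-really-mean-it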
I don't have any specific entries for these OSDs in ceph.conf so I guess
I shouldn't change any conf file.
Thank you.
This is the eighth update to the Ceph Nautilus release series. This release
fixes issues across a range of subsystems. We recommend that all users upgrade
to this release. Please note the following important changes in this
release; as always the full changelog is posted at:
https://ceph.io/releases/v14-2-8-nautilus-released
Notable Changes
---------------
* The default value of `bluestore_min_alloc_size_ssd` has been changed
to 4K to improve performance across all workloads.
* The following OSD memory config options related to bluestore cache autotuning can now
be configured during runtime:
- osd_memory_base (default: 768 MB)
- osd_memory_cache_min (default: 128 MB)
- osd_memory_expected_fragmentation (default: 0.15)
- osd_memory_target (default: 4 GB)
The above options can be set with::
ceph config set osd <option> <value>
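For example, setting a 6 GiB memory target on all OSDs at runtime (the value
is in bytes and is shown purely as an illustration)::
ceph config set osd osd_memory_target 6442450944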
* The MGR now accepts `profile rbd` and `profile rbd-read-only` user caps.
These caps can be used to provide users access to MGR-based RBD functionality
such as `rbd perf image iostat` and `rbd perf image iotop`.
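For example, a monitoring user could be created with these caps as follows
(the client name and pool are illustrative)::
ceph auth get-or-create client.rbd-monitor mon 'profile rbd' mgr 'profile rbd' osd 'profile rbd pool=rbd'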
* The configuration value `osd_calc_pg_upmaps_max_stddev` used for upmap
balancing has been removed. Instead use the mgr balancer config
`upmap_max_deviation` which now is an integer number of PGs of deviation
from the target PGs per OSD. This can be set with a command like
`ceph config set mgr mgr/balancer/upmap_max_deviation 2`. The default
`upmap_max_deviation` is 1. There are situations where crush rules
would not allow a pool to ever have completely balanced PGs. For example, if
crush requires 1 replica on each of 3 racks, but there are fewer OSDs in 1 of
the racks. In those cases, the configuration value can be increased.
* RGW: a mismatch between the bucket notification documentation and the actual
message format was fixed. This means that any endpoints receiving bucket
notifications will now receive the same notifications inside a JSON array
named 'Records'. Note that this does not affect pulling bucket notifications
from a subscription in a 'pubsub' zone, as these are already wrapped inside
that array.
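Illustratively, the payload delivered to an endpoint is now wrapped like this
(the inner event fields are elided)::
{
    "Records": [
        { ... one event object per notification ... }
    ]
}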
* CephFS: multiple active MDS forward scrub is now rejected. Scrub currently
only is permitted on a file system with a single rank. Reduce the ranks to one
via `ceph fs set <fs_name> max_mds 1`.
* Ceph now refuses to create a file system with a default EC data pool. For
further explanation, see:
https://docs.ceph.com/docs/nautilus/cephfs/createfs/#creating-pools
* Ceph will now issue a health warning if a RADOS pool has a `pg_num`
value that is not a power of two. This can be fixed by adjusting
the pool to a nearby power of two::
ceph osd pool set <pool-name> pg_num <new-pg-num>
Alternatively, the warning can be silenced with::
ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-14.2.8.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 2d095e947a02261ce61424021bb43bd3022d35cb
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer HRB 21284 (AG Nürnberg)