I am running Ceph 15.2.13 on CentOS 7.9.2009, and recently my MDS servers
have started failing with the error message:
In function 'void Server::handle_client_open(MDRequestRef&)' thread
7f0ca9908700 time 2021-06-28T09:21:11.484768+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.13/rpm/el7/BUILD/ceph-15.2.13/src/mds/Server.cc:
4149: FAILED ceph_assert(cur->is_auth())
The complete log is at:
https://gist.github.com/pvanheus/4da555a6de6b5fa5e46cbf74f5500fbd
The ceph status output is:
# ceph status
  cluster:
    id:     ed7b2c16-b053-45e2-a1fe-bf3474f90508
    health: HEALTH_WARN
            30 OSD(s) experiencing BlueFS spillover
            insufficient standby MDS daemons available
            1 MDSs report slow requests
            2 mgr modules have failed dependencies
            4347046/326505282 objects misplaced (1.331%)
            6 nearfull osd(s)
            23 pgs not deep-scrubbed in time
            23 pgs not scrubbed in time
            8 pool(s) nearfull

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 22m)
    mgr: ceph-mon1(active, since 11w), standbys: ceph-mon2, ceph-mon3
    mds: SANBI_FS:2 {0=ceph-mon1=up:active(laggy or crashed),1=ceph-mon2=up:stopping}
    osd: 54 osds: 54 up (since 2w), 54 in (since 11w); 50 remapped pgs

  data:
    pools:   8 pools, 833 pgs
    objects: 42.37M objects, 89 TiB
    usage:   159 TiB used, 105 TiB / 264 TiB avail
    pgs:     4347046/326505282 objects misplaced (1.331%)
             782 active+clean
             49  active+clean+remapped
             1   active+clean+scrubbing+deep
             1   active+clean+remapped+scrubbing

  io:
    client: 29 KiB/s rd, 427 KiB/s wr, 37 op/s rd, 48 op/s wr
When restarting an MDS, it goes through the states replay, reconnect and
resolve, and finally sets itself to active before this crash happens.
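For reference, the crash reports and MDS state can be inspected with the standard commands (assuming the crash module is enabled):

# list recorded daemon crashes, then show the backtrace of a specific one
ceph crash ls
ceph crash info <crash-id>

# current state of the MDS ranks, standbys and overall health
ceph fs status
ceph health detail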
Any advice on what to do?
Thanks,
Peter
P.S. apologies if you received this email more than once - I have had some
trouble figuring out the correct mailing list to use.
Hi,
I have set up a Ceph cluster with cephadm, using the Docker backend.
I want to move /var/lib/docker to a separate device to get better
performance and less load on the OS device.
I tried that by stopping Docker, copying the contents of /var/lib/docker to
the new device, and mounting the new device at /var/lib/docker.
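Concretely, the move was roughly this (a sketch of the procedure; the device and temporary mount point are example names):

systemctl stop docker
mount /dev/sdX1 /mnt/newdisk                 # temporarily mount the new device
rsync -aHAX /var/lib/docker/ /mnt/newdisk/   # copy preserving owners, ACLs and xattrs
umount /mnt/newdisk
mount /dev/sdX1 /var/lib/docker              # plus a matching /etc/fstab entry for reboots
systemctl start docker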
The other containers started and continue to work and run as expected.
But the Ceph containers seem to be broken, and I am not able to get them
back into a working state.
I have tried removing the host with `ceph orch host rm itcnchn-bb4067`
and re-adding it, but with no effect.
The strange thing is that 2 of the 4 containers come up as expected:
ceph orch ps itcnchn-bb4067
NAME                                  HOST            STATUS         REFRESHED  AGE  VERSION    IMAGE NAME               IMAGE ID      CONTAINER ID
crash.itcnchn-bb4067                  itcnchn-bb4067  running (18h)  10m ago    4w   15.2.7     docker.io/ceph/ceph:v15  2bc420ddb175  2af28c4571cf
mds.cephfs.itcnchn-bb4067.qzoshl      itcnchn-bb4067  error          10m ago    4w   <unknown>  docker.io/ceph/ceph:v15  <unknown>     <unknown>
mon.itcnchn-bb4067                    itcnchn-bb4067  error          10m ago    18h  <unknown>  docker.io/ceph/ceph:v15  <unknown>     <unknown>
rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc  itcnchn-bb4067  running (18h)  10m ago    4w   15.2.7     docker.io/ceph/ceph:v15  2bc420ddb175  00d000aec32b
The Docker logs from the active manager do not say much about what is
wrong:
debug 2021-01-05T09:57:52.537+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring mds.cephfs.itcnchn-bb4067.qzoshl (unknown last config time)...
debug 2021-01-05T09:57:52.541+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon mds.cephfs.itcnchn-bb4067.qzoshl on itcnchn-bb4067
debug 2021-01-05T09:57:52.973+0000 7fdb64e88700 0 log_channel(cluster) log [DBG] : pgmap v347: 241 pgs: 241 active+clean; 18 GiB data, 50 GiB used, 52 TiB / 52 TiB avail; 18 KiB/s rd, 78 KiB/s wr, 24 op/s
debug 2021-01-05T09:57:53.085+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring mon.itcnchn-bb4067 (unknown last config time)...
debug 2021-01-05T09:57:53.085+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon mon.itcnchn-bb4067 on itcnchn-bb4067
debug 2021-01-05T09:57:53.625+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc (unknown last config time)...
debug 2021-01-05T09:57:53.629+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc on itcnchn-bb4067
debug 2021-01-05T09:57:54.141+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring crash.itcnchn-bb4067 (unknown last config time)...
debug 2021-01-05T09:57:54.141+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon crash.itcnchn-bb4067 on itcnchn-bb4067
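A sketch of what could be tried next to recreate the two broken daemons (assuming `ceph orch daemon redeploy` and `cephadm logs` are available in this Octopus release; the daemon names are the ones from the `ceph orch ps` output above):

# ask the orchestrator to recreate the broken containers
ceph orch daemon redeploy mon.itcnchn-bb4067
ceph orch daemon redeploy mds.cephfs.itcnchn-bb4067.qzoshl

# on the host itself, check what cephadm thinks is deployed and look at the daemon logs
cephadm ls
cephadm logs --name mon.itcnchn-bb4067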
- Karsten
Has anybody run into a 'stuck' OSD service specification? I've tried
to delete it, but it's stuck in 'deleting' state, and has been for
quite some time (even prior to upgrade, on 15.2.x). This is on 16.2.3:
NAME          PORTS  RUNNING  REFRESHED   AGE  PLACEMENT
osd.osd_spec         504/525  <deleting>  12m  label:osd
root@ceph01:/# ceph orch rm osd.osd_spec
Removed service osd.osd_spec
From active monitor:
debug 2021-05-06T23:14:48.909+0000 7f17d310b700 0 log_channel(cephadm) log [INF] : Remove service osd.osd_spec
Yet in `ceph orch ls`, it's still there, same as above. Here is --export on it:
root@ceph01:/# ceph orch ls osd.osd_spec --export
service_type: osd
service_id: osd_spec
service_name: osd.osd_spec
placement: {}
unmanaged: true
spec:
  filter_logic: AND
  objectstore: bluestore
We've tried --force, as well, with no luck.
To be clear, the --export even prior to delete looks nothing like the
actual service specification we're using, even after I re-apply it, so
something seems 'bugged'. Here's the OSD specification we're applying:
service_type: osd
service_id: osd_spec
placement:
  label: "osd"
data_devices:
  rotational: 1
db_devices:
  rotational: 0
db_slots: 12
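For reference, this is how we have been (re-)applying the spec (the filename is just an example):

# preview what the orchestrator would do with the spec, then apply it
ceph orch apply -i osd_spec.yaml --dry-run
ceph orch apply -i osd_spec.yaml

# check what was actually stored
ceph orch ls osd --export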
I would appreciate any insight into how to clear this up without
removing the actual OSDs; we just want to apply the updated service
specification (we used to use host placement rules and are switching
to label-based).
Thanks,
David
Dear cephers,
I have a strange problem. An OSD went down and recovery finished. For some reason, I have a slow ops warning for the failed OSD stuck in the system:
health: HEALTH_WARN
430 slow ops, oldest one blocked for 36 sec, osd.580 has slow ops
The OSD is auto-out:
| 580 | ceph-22 | 0 | 0 | 0 | 0 | 0 | 0 | autoout,exists |
It is probably a warning dating back to just before the failure. How can I clear it?
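A sketch of what could be tried, on the assumption that the warning is just stale state rather than a live problem (not verified):

# on ceph-22: restart the failed OSD so it re-reports (or clears) its op state
systemctl restart ceph-osd@580

# if the warning persists, restart the mons one at a time to drop any stale health report
systemctl restart ceph-mon.target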
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
I’m continuously getting scrub errors in my index pool and log pool that I always need to repair.
HEALTH_ERR 2 scrub errors; Possible data damage: 1 pg inconsistent
[ERR] OSD_SCRUB_ERRORS: 2 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
pg 20.19 is active+clean+inconsistent, acting [39,41,37]
Why is this?
I have no clue at all: no log entries, nothing ☹
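For context, the repair each time is just the standard sequence (pg 20.19 is the one from the output above; the list-inconsistent-obj step is for inspection):

# show which objects/shards the scrub flagged as inconsistent
rados list-inconsistent-obj 20.19 --format=json-pretty

# see which OSDs are acting for the PG, then check their logs for the scrub error details
ceph pg map 20.19

# repair the PG
ceph pg repair 20.19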
Hi,
We've done our fair share of Ceph cluster upgrades since Hammer, and
have not seen many problems with them. I'm now at the point where I have
to upgrade a rather large cluster running Luminous, and I would like to
hear from other users whether they have experienced issues I should
expect, so that I can anticipate them beforehand.
As said, the cluster is running Luminous (12.2.13) and has the following
services active:
services:
mon: 3 daemons, quorum osdnode01,osdnode02,osdnode04
mgr: osdnode01(active), standbys: osdnode02, osdnode03
mds: pmrb-3/3/3 up {0=osdnode06=up:active,1=osdnode08=up:active,2=osdnode07=up:active}, 1 up:standby
osd: 116 osds: 116 up, 116 in;
rgw: 3 daemons active
Of the OSDs, 11 are SSDs and 105 are HDDs. The capacity of the cluster
is 1.01 PiB.
We have 2 active CRUSH rules on 18 pools. All pools have a size of 3, and
there is a total of 5760 PGs.
{
    "rule_id": 1,
    "rule_name": "hdd-data",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -10,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
},
{
    "rule_id": 2,
    "rule_name": "ssd-data",
    "ruleset": 2,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -21,
            "item_name": "default~ssd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
rbd -> crush_rule: hdd-data
.rgw.root -> crush_rule: hdd-data
default.rgw.control -> crush_rule: hdd-data
default.rgw.data.root -> crush_rule: ssd-data
default.rgw.gc -> crush_rule: ssd-data
default.rgw.log -> crush_rule: ssd-data
default.rgw.users.uid -> crush_rule: hdd-data
default.rgw.usage -> crush_rule: ssd-data
default.rgw.users.email -> crush_rule: hdd-data
default.rgw.users.keys -> crush_rule: hdd-data
default.rgw.meta -> crush_rule: hdd-data
default.rgw.buckets.index -> crush_rule: ssd-data
default.rgw.buckets.data -> crush_rule: hdd-data
default.rgw.users.swift -> crush_rule: hdd-data
default.rgw.buckets.non-ec -> crush_rule: ssd-data
DB0475 -> crush_rule: hdd-data
cephfs_pmrb_data -> crush_rule: hdd-data
cephfs_pmrb_metadata -> crush_rule: ssd-data
All but four clients are running Luminous; those four are running Jewel
(they need to be upgraded before proceeding with this upgrade).
So, normally, I would 'just' upgrade all Ceph packages on the
monitor nodes and restart the mons and then the mgrs.
After that, I would upgrade all Ceph packages on the OSD nodes and
restart all the OSDs. Then, after that, the MDSes and RGWs. Restarting
the OSDs will probably take a while.
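In command form, the plan is roughly the usual Luminous-to-Nautilus sequence (a sketch, nothing cluster-specific):

# before starting
ceph osd set noout

# on each mon node: upgrade the packages, then
systemctl restart ceph-mon.target
ceph versions                      # confirm all mons run the new version before continuing
systemctl restart ceph-mgr.target

# on each OSD node, one at a time: upgrade the packages, then
systemctl restart ceph-osd.target  # wait for all PGs active+clean before the next node

# then the MDS and RGW nodes
systemctl restart ceph-mds.target
systemctl restart ceph-radosgw.target

# once everything runs the new release
ceph osd require-osd-release nautilus
ceph osd unset noout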
If anyone has a hint on what I should expect to cause some extra load or
waiting time, that would be great.
Obviously, we have read
https://ceph.com/releases/v14-2-0-nautilus-released/ , but I'm looking
for real world experiences.
Thanks!
--
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | info(a)tuxis.nl
Hey ceph-users,
I set up multisite sync between two freshly installed Octopus clusters.
In the first cluster I created a bucket with some data just to test the
replication of actual data later.
I then followed the instructions on
https://docs.ceph.com/en/octopus/radosgw/multisite/#migrating-a-single-site…
to add a second zone.
Things went well and both zones are now happily reaching each other and
the API endpoints are talking.
Also the metadata is in sync already - both sides are happy and I can
see bucket listings and users are "in sync":
> # radosgw-admin sync status
>           realm 13d1b8cb-dc76-4aed-8578-2ce5d3d010e8 (obst)
>       zonegroup 17a06c15-2665-484e-8c61-cbbb806e11d2 (obst-fra)
>            zone 6d2c1275-527e-432f-a57a-9614930deb61 (obst-rgn)
>   metadata sync no sync (zone is master)
>       data sync source: c07447eb-f93a-4d8f-bf7a-e52fade399f3 (obst-az1)
>                         init
>                         full sync: 128/128 shards
>                         full sync: 0 buckets to sync
>                         incremental sync: 0/128 shards
>                         data is behind on 128 shards
>                         behind shards: [0...127]
>
and on the other side ...
> # radosgw-admin sync status
>           realm 13d1b8cb-dc76-4aed-8578-2ce5d3d010e8 (obst)
>       zonegroup 17a06c15-2665-484e-8c61-cbbb806e11d2 (obst-fra)
>            zone c07447eb-f93a-4d8f-bf7a-e52fade399f3 (obst-az1)
>   metadata sync syncing
>                 full sync: 0/64 shards
>                 incremental sync: 64/64 shards
>                 metadata is caught up with master
>       data sync source: 6d2c1275-527e-432f-a57a-9614930deb61 (obst-rgn)
>                         init
>                         full sync: 128/128 shards
>                         full sync: 0 buckets to sync
>                         incremental sync: 0/128 shards
>                         data is behind on 128 shards
>                         behind shards: [0...127]
>
Also, newly created buckets (read: their metadata) are synced.
What is apparently not working is the sync of actual data.
Upon startup, the radosgw on the second site shows:
> 2021-06-25T16:15:06.445+0000 7fe71eff5700 1 RGW-SYNC:meta: start
> 2021-06-25T16:15:06.445+0000 7fe71eff5700 1 RGW-SYNC:meta: realm epoch=2 period id=f4553d7c-5cc5-4759-9253-9a22b051e736
> 2021-06-25T16:15:11.525+0000 7fe71dff3700 0 RGW-SYNC:data:sync:init_data_sync_status: ERROR: failed to read remote data log shards
>
Also, when issuing
# radosgw-admin data sync init --source-zone obst-rgn
it throws:
> 2021-06-25T16:20:29.167+0000 7f87c2aec080 0 RGW-SYNC:data:init_data_sync_status: ERROR: failed to read remote data log shards
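For completeness, these are the checks I plan to run on both zones (a sketch; standard radosgw-admin commands, nothing specific to my setup beyond the zone names above):

# confirm both zones agree on the current period
radosgw-admin period get

# check the endpoints and the system user's access/secret keys configured for each zone
radosgw-admin zone get --rgw-zone=obst-rgn
radosgw-admin zone get --rgw-zone=obst-az1

# after any change to the zone/zonegroup configuration, commit it and restart the gateways
radosgw-admin period update --commit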
Does anybody have any hints on where to look for what could be broken here?
Thanks a bunch,
Regards
Christian