Hi all,
We have a few subdirs with an rctime in the future.
# getfattr -n ceph.dir.rctime session
# file: session
ceph.dir.rctime="2576387188.090"
I can't find any subdir or item in that directory with that rctime, so I presume that there was previously a file with that ctime and that rctime cannot go backwards [1].
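For reference, this is roughly how I've been scanning the subtree to compare (assumes GNU find; run from inside the directory in question):
cd session
# newest regular-file ctime in the subtree, as seconds since the epoch
find . -type f -printf '%C@ %p\n' | sort -n | tail -3
# rctime of every subdirectory
find . -type d -exec getfattr -n ceph.dir.rctime {} \;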
Is there any way to fix these rctimes so they show the latest ctime of
the subtree?
Also -- are we still relying on the client clock to set the rctime / ctime of a file? Would it make sense to clamp the ctime/rctime of any update to the current time on the MDS?
Best Regards,
Dan
[1] https://github.com/ceph/ceph/pull/24023/commits/920ef964311a61fcc6c0d6671b7…
Hi,
All of a sudden, we are experiencing very concerning MON behaviour. We have five MONs, and all of them have thousands to tens of thousands of slow ops, with the oldest one blocking essentially indefinitely (at least the timer keeps creeping up). Additionally, the MON stores keep inflating heavily. Under normal circumstances we have about 450-550 MB there; right now it's 27 GB and growing rapidly.
I tried restarting all MONs, disabled auto-scaling (just in case), and checked the system load and hardware. I also restarted the MGR and MDS daemons, but to no avail.
Is there any way I can debug this properly? I can't seem to find a way to actually view which ops are causing this and which client (if any) may be responsible for them.
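Would poking at the mon admin sockets along these lines be the right way to see them? (Just a guess on my part; the daemon names and socket access will differ in containerized setups.)
# in-flight ops on one mon (repeat on each mon host; adjust the mon name)
ceph daemon mon.$(hostname -s) ops
# connected sessions, to see which clients the ops belong to
ceph daemon mon.$(hostname -s) sessions
# health detail also summarizes which daemons are reporting slow ops
ceph health detail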
Thanks
Janek
I have a cephfs secondary (non-root) data pool with unfound and degraded objects that I have not been able to recover [1]. I created an additional data pool and used 'setfattr -n ceph.dir.layout.pool' and a very long rsync to move the files off of the degraded pool and onto the new pool. This has completed, and using find + 'getfattr -n ceph.file.layout.pool', I verified that no files are using the old pool anymore. No ceph.dir.layout.pool attributes point to the old pool either.
However, the old pool still reports that it contains objects, likely the same ones that were unfound/degraded before:
https://pastebin.com/qzVA7eZr
Based on an old message from the mailing list [2], I checked the MDS for stray objects (ceph daemon mds.ceph4 dump cache file.txt ; grep -i stray file.txt) and found 36 stray entries in the cache: https://pastebin.com/MHkpw3DV. However, I'm not certain how to map these stray cache entries to the clients that may still be accessing them.
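Is something along these lines the right way to tie a stray back to an inode and a client session? (The object-name prefix should be the inode number in hex, if I understand correctly, and 'dump inode' may not be available on every release.)
# convert an object-name prefix (hex inode number) to decimal
ino=$(printf '%d' 0x10000020fa1)
# dump that single inode from the MDS cache, if the asok command exists
ceph daemon mds.ceph4 dump inode $ino
# list client sessions to see who might still hold caps on it
ceph daemon mds.ceph4 session ls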
'rados -p fs.data.archive.frames ls' shows 145 objects. Looking at the
parent of each object shows 2 strays:
for obj in $(cat rados.ls.txt); do
    echo $obj
    rados -p fs.data.archive.frames getxattr $obj parent | strings
done
[...]
10000020fa1.00000000
10000020fa1
stray6
10000020fbc.00000000
10000020fbc
stray6
[...]
...before getting stuck on one object for over 5 minutes (then I gave up):
1000005b1af.00000083
What can I do to make sure this pool is ready to be safely deleted from
cephfs (ceph fs rm_data_pool archive fs.data.archive.frames)?
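Is checking something like the following enough to confirm the pool is empty before running rm_data_pool (just my own checklist; not sure it's sufficient)?
# object/space counts for the old pool as RADOS sees them
rados df | grep fs.data.archive.frames
ceph df detail | grep fs.data.archive.frames
# count whatever is left in the pool directly
rados -p fs.data.archive.frames ls | wc -l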
--Mike
[1]https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/QHFOGEKXK7VDNNSKR74BA6IIMGGIXBXA/#7YQ6SSTESM5LTFVLQK3FSYFW5FDXJ5CF
[2]http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005233.h…
Hi everyone!
I'm excited to announce two talks we have on the schedule for February 2021:
Jason Dillaman will be giving part 2 of the librbd code walk-through.
The stream starts on February 23rd at 18:00 UTC / 19:00 CET / 1:00 PM
EST / 10:00 AM PST
https://tracker.ceph.com/projects/ceph/wiki/Code_Walkthroughs
Part 1: https://www.youtube.com/watch?v=L0x61HpREy4
--------------
What's New in the Pacific Release
Hear Sage Weil give a live update on the development of the Pacific Release.
The stream starts on February 25th at 17:00 UTC / 18:00 CET / 12 PM
EST / 9 AM PST.
https://ceph.io/ceph-tech-talks/
All live streams will be recorded and made available afterward.
--
Mike Perez
Hi everyone!
Be sure to make your voice heard by taking the Ceph User Survey before
April 2, 2021. This information will help guide the Ceph community's investment in Ceph and its future development.
https://ceph.io/user-survey/
Thank you to the Ceph User Survey Working Group for designing this
year's survey!
* Anantha Adiga
* Anthony D'Atri
* Paul Mezzanini
* Stefan Kooman
https://tracker.ceph.com/projects/ceph/wiki/User_Survey_Working_Group
We will provide the final results in the coming months after the
survey has ended.
--
Mike Perez
Hi All,
We've been dealing with what seems to be a pretty annoying bug for a while
now. We are unable to delete a customer's bucket that seems to have an
extremely large number of aborted multipart uploads. I've had $(radosgw-admin
bucket rm --bucket=pusulax --purge-objects) running in a screen session for
almost 3 weeks now and it's still not finished; it's most likely stuck in a
loop or something. The screen session with debug-rgw=10 spams billions of
these messages:
2021-02-23 15:38:58.667 7f9b55704840 10
RGWRados::cls_bucket_list_unordered: got
_multipart_04/d3/04d33e18-3f13-433c-b924-56602d702d60-31.msg.2~0DTalUjTHsnIiKraN1klwIFO88Vc2E3.meta[]
2021-02-23 15:38:58.667 7f9b55704840 10
RGWRados::cls_bucket_list_unordered: got
_multipart_04/d7/04d7ad26-c8ec-4a39-9938-329acd6d9da7-102.msg.2~K_gAeTpfEongNvaOMNa0IFwSGPpQ1iA.meta[]
2021-02-23 15:38:58.667 7f9b55704840 10
RGWRados::cls_bucket_list_unordered: got
_multipart_04/da/04da4147-c949-4c3a-aca6-e63298f5ff62-102.msg.2~-hXBSFcjQKbMkiyEqSgLaXMm75qFzEp.meta[]
2021-02-23 15:38:58.667 7f9b55704840 10
RGWRados::cls_bucket_list_unordered: got
_multipart_04/db/04dbb0e6-dfb0-42fb-9d0f-49cceb18457f-102.msg.2~B5EhGgBU5U_U7EA5r8IhVpO3Aj2OvKg.meta[]
2021-02-23 15:38:58.667 7f9b55704840 10
RGWRados::cls_bucket_list_unordered: got
_multipart_04/df/04df39be-06ab-4c72-bc63-3fac1d2700a9-11.msg.2~_8h5fWlkNrIMqcrZgNbAoJfc8BN1Xx-.meta[]
This is probably the 2nd or 3rd time I've been unable to delete this
bucket. I also tried running $(radosgw-admin bucket check --fix
--check-objects --bucket=pusulax) before kicking off the delete job, but
that didn't work either. Here is the bucket in question, the num_objects
counter never decreases after trying to delete the bucket:
[root@os5 ~]# radosgw-admin bucket stats --bucket=pusulax
{
"bucket": "pusulax",
"num_shards": 144,
"tenant": "",
"zonegroup": "dbb69c5b-b33f-4af2-950c-173d695a4d2c",
"placement_rule": "default-placement",
"explicit_placement": {
"data_pool": "",
"data_extra_pool": "",
"index_pool": ""
},
"id": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.3209338.4",
"marker": "c0d0b8a5-c63c-4c24-9dab-8deee88dbf0b.3292800.7",
"index_type": "Normal",
"owner": "REDACTED",
"ver":
"0#115613,1#115196,2#115884,3#115497,4#114649,5#114150,6#116127,7#114269,8#115220,9#115092,10#114003,11#114538,12#115235,13#113463,14#114928,15#115135,16#115535,17#114867,18#116010,19#115766,20#115274,21#114818,22#114805,23#114853,24#114099,25#114359,26#114966,27#115790,28#114572,29#114826,30#114767,31#115614,32#113995,33#115305,34#114227,35#114342,36#114144,37#114704,38#114088,39#114738,40#114133,41#114520,42#114420,43#114168,44#113820,45#115093,46#114788,47#115522,48#114713,49#115315,50#115055,51#114513,52#114086,53#114401,54#114079,55#113649,56#114089,57#114157,58#114064,59#115224,60#114753,61#114686,62#115169,63#114321,64#114949,65#115075,66#115003,67#114993,68#115320,69#114392,70#114893,71#114219,72#114190,73#114868,74#113432,75#114882,76#115300,77#114755,78#114598,79#114221,80#114895,81#114031,82#114566,83#113849,84#115155,85#113790,86#113334,87#113800,88#114856,89#114841,90#115073,91#113849,92#114554,93#114820,94#114256,95#113840,96#114838,97#113784,98#114876,99#115524,100#115686,101#112969,102#112156,103#112635,104#112732,105#112933,106#112412,107#113090,108#112239,109#112697,110#113444,111#111730,112#112446,113#114479,114#113318,115#113032,116#112048,117#112404,118#114545,119#112563,120#112341,121#112518,122#111719,123#112273,124#112014,125#112979,126#112209,127#112830,128#113186,129#112944,130#111991,131#112865,132#112688,133#113819,134#112586,135#113275,136#112172,137#113019,138#112872,139#113130,140#112716,141#112091,142#111859,143#112773",
"master_ver":
"0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0,11#0,12#0,13#0,14#0,15#0,16#0,17#0,18#0,19#0,20#0,21#0,22#0,23#0,24#0,25#0,26#0,27#0,28#0,29#0,30#0,31#0,32#0,33#0,34#0,35#0,36#0,37#0,38#0,39#0,40#0,41#0,42#0,43#0,44#0,45#0,46#0,47#0,48#0,49#0,50#0,51#0,52#0,53#0,54#0,55#0,56#0,57#0,58#0,59#0,60#0,61#0,62#0,63#0,64#0,65#0,66#0,67#0,68#0,69#0,70#0,71#0,72#0,73#0,74#0,75#0,76#0,77#0,78#0,79#0,80#0,81#0,82#0,83#0,84#0,85#0,86#0,87#0,88#0,89#0,90#0,91#0,92#0,93#0,94#0,95#0,96#0,97#0,98#0,99#0,100#0,101#0,102#0,103#0,104#0,105#0,106#0,107#0,108#0,109#0,110#0,111#0,112#0,113#0,114#0,115#0,116#0,117#0,118#0,119#0,120#0,121#0,122#0,123#0,124#0,125#0,126#0,127#0,128#0,129#0,130#0,131#0,132#0,133#0,134#0,135#0,136#0,137#0,138#0,139#0,140#0,141#0,142#0,143#0",
"mtime": "2020-06-17 20:27:16.685833Z",
"max_marker":
"0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#,11#,12#,13#,14#,15#,16#,17#,18#,19#,20#,21#,22#,23#,24#,25#,26#,27#,28#,29#,30#,31#,32#,33#,34#,35#,36#,37#,38#,39#,40#,41#,42#,43#,44#,45#,46#,47#,48#,49#,50#,51#,52#,53#,54#,55#,56#,57#,58#,59#,60#,61#,62#,63#,64#,65#,66#,67#,68#,69#,70#,71#,72#,73#,74#,75#,76#,77#,78#,79#,80#,81#,82#,83#,84#,85#,86#,87#,88#,89#,90#,91#,92#,93#,94#,95#,96#,97#,98#,99#,100#,101#,102#,103#,104#,105#,106#,107#,108#,109#,110#,111#,112#,113#,114#,115#,116#,117#,118#,119#,120#,121#,122#,123#,124#,125#,126#,127#,128#,129#,130#,131#,132#,133#,134#,135#,136#,137#,138#,139#,140#,141#,142#,143#",
"usage": {
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 97009704,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 94737,
"num_objects": 3592952
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
}
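One thing I'm tempted to try first is aborting the stale multipart uploads through the S3 API before re-running the bucket delete, something along these lines (the endpoint and credentials are placeholders, and I haven't verified how well this scales to millions of uploads):
# list every incomplete multipart upload in the bucket and abort it
aws --endpoint-url https://rgw.example.com s3api list-multipart-uploads \
    --bucket pusulax --query 'Uploads[].[UploadId,Key]' --output text |
while read -r uploadid key; do
    aws --endpoint-url https://rgw.example.com s3api abort-multipart-upload \
        --bucket pusulax --key "$key" --upload-id "$uploadid"
done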
We're running 14.2.16 on our RGWs and OSD nodes. Anyone have any ideas? Is it possible to target this bucket directly via rados to try and delete it? I'm wary of doing stuff like that, though. Thanks in advance.
- Dave Monschein
Hi,
Having been slightly caught out by tunables on my Octopus upgrade[0],
can I just check that if I do
ceph osd crush tunables optimal
that will update the tunables on the cluster to the current "optimal" values (and move a lot of data around), but that this doesn't mean they'll change the next time I upgrade the cluster or anything like that?
It's not quite clear from the documentation whether, the next time the "optimal" tunables change, the new values will be applied to a cluster where I've set tunables this way, or whether tunables are only ever changed by a fresh invocation of ceph osd crush tunables...
[I assume the same answer applies to "default"?]
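(For reference, I assume the current values can be checked before and after with something like:)
# show the profile and individual tunable values currently in effect
ceph osd crush show-tunables
# the same information appears under "tunables" in the full crush dump
ceph osd crush dump | less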
Regards,
Matthew
[0] I foolishly thought a cluster initially installed as Jewel would
have jewel tunables
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.
Hi,
We see that we have 5 'remapped' PGs, but we are unclear why, or what to do about it. We shifted some target ratios for the autobalancer and it resulted in this state. When adjusting the ratios, we noticed two OSDs go down, but we just restarted the containers for those OSDs with podman, and they came back up.
Here's status output:
###################
root@ceph01:~# ceph status
INFO:cephadm:Inferring fsid x
INFO:cephadm:Inferring config x
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
cluster:
id: 41bb9256-c3bf-11ea-85b9-9e07b0435492
health: HEALTH_OK
services:
mon: 5 daemons, quorum ceph01,ceph04,ceph02,ceph03,ceph05 (age 2w)
mgr: ceph03.ytkuyr(active, since 2w), standbys: ceph01.aqkgbl,
ceph02.gcglcg, ceph04.smbdew, ceph05.yropto
osd: 168 osds: 168 up (since 2d), 168 in (since 2d); 5 remapped pgs
data:
pools: 3 pools, 1057 pgs
objects: 18.00M objects, 69 TiB
usage: 119 TiB used, 2.0 PiB / 2.1 PiB avail
pgs: 1056 active+clean
1 active+clean+scrubbing+deep
io:
client: 859 KiB/s rd, 212 MiB/s wr, 644 op/s rd, 391 op/s wr
root@ceph01:~#
###################
When I look at ceph pg dump, I don't see any marked as remapped:
###################
root@ceph01:~# ceph pg dump |grep remapped
INFO:cephadm:Inferring fsid x
INFO:cephadm:Inferring config x
INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
dumped all
root@ceph01:~#
###################
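I also wondered whether something like the following is a better way to list them directly, and whether balancer upmap entries are what's being counted (just a guess on my part):
ceph pg ls remapped            # list PGs currently in the remapped state
ceph osd dump | grep pg_upmap  # check for balancer upmap entries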
Any idea what might be going on/how to recover? All OSDs are up. Health is
'OK'. This is Ceph 15.2.4 deployed using Cephadm in containers, on Podman
2.0.3.
I'm new to Ceph, and I've been trying to set up a new cluster with 16 computers, each with 30 disks and 6 SSDs (plus boot disks), 256 GB of memory, and IB networking. (OK, it's currently 15, but never mind.)
When I take them over about 10 OSDs each, they start having problems starting the OSDs up. I can normally fix this by rebooting them and it will continue again for a while, and it is possible to get them up to the full complement with a bit of poking around. (Once it's working it's fine, unless you start adding services or moving the OSDs around.)
Is there anything I can change to make it a bit more stable?
I've already set
fs.aio-max-nr = 1048576
kernel.pid_max = 4194303
fs.file-max = 500000
which made it a bit better, but I feel it could be even better.
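(For reference, I apply those via a sysctl drop-in, roughly like this; the file name is just my own choice:)
# /etc/sysctl.d/90-ceph-osd.conf -- loaded with "sysctl --system"
fs.aio-max-nr = 1048576
kernel.pid_max = 4194303
fs.file-max = 500000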
I'm currently trying to upgrade to 15.2.9 from the default cephadm version of Octopus. The upgrade is going very, very slowly. I'm currently using podman, if that helps; I'm not sure if docker would be better. (I've mainly used Singularity when I've handled containers before.)
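(Is watching it with the commands below the right way to see why it's slow?)
ceph orch upgrade status   # current target image and progress
ceph -W cephadm            # follow the cephadm module log live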
Thanks in advance
Peter Childs