April 2020 - ceph-users - lists.ceph.io

Balancer not balancing (14.2.7, crush-compat)

by Vladimir Brik

Hello I am running ceph 14.2.7 with balancer in crush-compat mode (needed because of old clients), but it's doesn't seem to be doing anything. It used to work in the past. I am not sure what changed. I created a big pool, ~285TB stored, and it doesn't look like it ever got balanced: pool 43 'fs-data-k5m2-hdd' erasure size 7 min_size 6 crush_rule 7 object_hash rjenkins pg_num 2048 pgp_num 2048 autoscale_mode warn last_change 48647 lfor 0/42080/42102 flags hashpspool,ec_overwrites,nearfull stripe_width 20480 application cephfs OSD utilization varies between ~50% and about ~80%, with about 60% raw used. I am using a mixture of 9TB and 14TB drives. Number of PGs/drive varies 103 and 207. # ceph osd df | grep hdd | sort -k 17 | (head -n 2; tail -n 2) 160 hdd 12.53519 1.00000 13 TiB 6.0 TiB 5.9 TiB 74 KiB 12 GiB 6.6 TiB 47.74 0.79 120 up 146 hdd 12.53519 1.00000 13 TiB 6.0 TiB 6.0 TiB 51 MiB 13 GiB 6.5 TiB 48.17 0.80 119 up 79 hdd 8.99799 1.00000 9.0 TiB 7.3 TiB 7.2 TiB 42 KiB 16 GiB 1.7 TiB 80.91 1.34 186 up 62 hdd 8.99799 1.00000 9.0 TiB 7.3 TiB 7.2 TiB 112 KiB 16 GiB 1.7 TiB 81.44 1.35 189 up # ceph balancer status { "last_optimize_duration": "0:00:00.339635", "plans": [], "mode": "crush-compat", "active": true, "optimize_result": "Some osds belong to multiple subtrees: {0: ['default', 'default~hdd'], ... "last_optimize_started": "Thu Apr 9 11:17:40 2020" } Does anybody know how to debug this? Thanks, Vlad

4 years

1
1
0 0

Bucket index entries containing unicode NULL - how to remove them?

by Maks Kowalik

Hello, since some time I've been investigating problems causing buckets' index corruption. In my case it's been because of numerous bugs related to index resharding and bucket lifecycle policies. One of those bugs present in versions prior to 14.2.8 made the index omapkeys' names contain unicode NULL which manifests like this: root@mach0122:~/mkw # radosgw-admin bi list --bucket=sysa-user-logs|grep idx |tail -5 "idx": "_multipart_user-ec24b85efa1c/user-ec24b85efa1c-AA-U79.log-2020031521.gz.u6_qPM0X2mXQRXD0hEfRK2dCc4dD4El\u0000.4", "idx": "_multipart_user-ec24b85efa1c/user-ec24b85efa1c-AA-U79.log-2020031600.gz.bvZdIu5KnsAQyXoa_ZWpix-BYD-yvcz\u0000.15", "idx": "_multipart_user-ec24b85efa1c/user-ec24b85efa1c-AA-U79.log-2020031600.gz.wX9jHyKnJ21w5DYL_rdfTsb7zH3tUEa\u0000.10", "idx": "_multipart_user-ec24b85efa1c/user-ec24b85efa1c-AA-U79.log-2020031601.gz.K6GhZfNzcqDCcjkrn5GskRWS8ufXSbO\u0000.8", "idx": "_multipart_user-ec24b85efa1c/user-ec24b85efa1c-AA-U79.log-2020031601.gz.vEr4VCu0QU07te_RHNJ9Wi1cb8tsiYq\u0000.7", I am unable to remove them by rmomapkey. Also I tried to use the method https://github.com/ceph/ceph/blob/8c63b26fe88bb02d894705cb1beec289668fb43d/… but it fails with: File "./idx_removal.py", line 96, in remove_key iocontext.remove_omap_keys(write_op, key) File "rados.pyx", line 516, in rados.requires.wrapper.validate_func (/build/ceph-14.2.8/obj-x86_64-linux-gnu/src/pybind/rados/pyrex/rados.c:4992) File "rados.pyx", line 3607, in rados.Ioctx.remove_omap_keys (/build/ceph-14.2.8/obj-x86_64-linux-gnu/src/pybind/rados/pyrex/rados.c:45059) File "rados.pyx", line 543, in rados.cstr_list (/build/ceph-14.2.8/obj-x86_64-linux-gnu/src/pybind/rados/pyrex/rados.c:5665) File "rados.pyx", line 539, in rados.cstr (/build/ceph-14.2.8/obj-x86_64-linux-gnu/src/pybind/rados/pyrex/rados.c:5463) TypeError: keys must be a string Does anyone know what else can be done to remove such omapkey? For the obvious reasons I can't use the clear_omap method... Kind regards, Maks Kowalik

4 years

1
0
0 0

Using M2 SSDs as osds

by Stolte, Felix

Hey guys, I am evaluating using M2 SSDs as osds for an all flash pool. Is anyone using that in production and can elaborate on his experience? I am a little bit concerned about the lifetime of the M2 disks. Best regards Felix IT-Services Telefon 02461 61-9243 E-Mail: f.stolte(a)fz-juelich.de ------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------- Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt ------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------

4 years

2
1
0 0

Nautilus 14.2.7 radosgw lifecycle not removing expired objects

by oneill.gs＠gmail.com

Hello, I'm running a cluster with Ceph version 14.2.7 (3d58626ebeec02d8385a4cefb92c6cbc3a45bfe8) nautilus (stable). I've encountered an issue with my cluster where objects are marked as expired but are not removed during lifecycle processing. These buckets have a mix of objects with and without special characters in their names. I set debug_rgw to 20/5 overnight so I could get more detailed logs when the lifecycle processing happened. The lifecycle function seems to correctly identify expired objects that should be removed, however they are not deleted. This is happening across multiple buckets and users. This cluster does have dynamic bucket index resharding enabled, however this issue affects buckets regardless of their reshard status. I also have cache tiering set to readproxy for the buckets data pool. root@rgw01 # ceph daemon /var/run/ceph/ceph-rgw-rgw01.asok config show | grep -E "lifecycle|lc" "rgw_lifecycle_work_time": "00:00-06:00", "rgw_enable_lc_threads": "true", "rgw_lc_debug_interval": "-1", "rgw_lc_lock_max_time": "60", "rgw_lc_max_objs": "32", "rgw_lc_max_rules": "1000", "rgw_lc_thread_delay": "0", root@mon01 # radosgw-admin lc list [ { "bucket": ":testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50", "status": "COMPLETE" } ] root@mon01 # radosgw-admin lc get --bucket testgso { "prefix_map": { "": { "status": true, "dm_expiration": false, "expiration": 1, "noncur_expiration": 0, "mp_expiration": 1 } }, "rule_map": [ { "id": "111222333", "rule": { "id": "111222333", "prefix": "", "status": "Enabled", "expiration": { "days": "1", "date": "" }, "noncur_expiration": { "days": "", "date": "" }, "mp_expiration": { "days": "1", "date": "" }, "filter": { "prefix": "", "obj_tags": {} }, "dm_expiration": false } } ] } goneill@mbp $ s3cmd -c Documents/s3cfg/testgso.cfg ls s3://testgso DIR s3://testgso/lifecycle/ 2020-03-31 17:26 12 s3://testgso/file!test.txt 2020-03-31 17:29 12 s3://testgso/file$test.txt 2020-03-31 17:29 12 s3://testgso/file&test.txt 2020-03-31 17:26 12 s3://testgso/file'test.txt 2020-03-31 17:26 12 s3://testgso/file(test.txt 2020-03-31 17:26 12 s3://testgso/file)test.txt 2020-03-31 17:29 12 s3://testgso/file*test.txt 2020-03-31 17:30 12 s3://testgso/file+test.txt 2020-03-31 17:30 12 s3://testgso/file,test.txt 2020-03-31 17:26 12 s3://testgso/file-test.txt 2020-03-31 17:30 12 s3://testgso/file:test.txt 2020-03-31 17:30 12 s3://testgso/file;test.txt 2020-03-31 17:30 12 s3://testgso/file=test.txt 2020-03-31 17:30 12 s3://testgso/file?test.txt 2020-03-31 17:06 12 s3://testgso/file@test.txt 2019-08-05 21:02 12 s3://testgso/test.txt goneill@mbp $ s3cmd -c Documents/s3cfg/testgso.cfg getlifecycle s3://testgso <?xml version="1.0" ?> <LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/"> <Rule> <ID>111222333</ID> <Prefix/> <Status>Enabled</Status> <Expiration> <Days>1</Days> </Expiration> <AbortIncompleteMultipartUpload> <DaysAfterInitiation>1</DaysAfterInitiation> </AbortIncompleteMultipartUpload> </Rule> </LifecycleConfiguration> ### Logs from rgw01 2020-04-08 00:00:00.645 7fc55bbcc700 20 get_system_obj_state: rctx=0x7fc55bbc9bf0 obj=default.rgw.meta:root:testgso state=0x561a8bc0ac40 s->prefetch_data=0 2020-04-08 00:00:00.645 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+testgso : expiry miss 2020-04-08 00:00:00.645 7fc55bbcc700 10 cache put: name=default.rgw.meta+root+testgso info.flags=0x16 2020-04-08 00:00:00.645 7fc55bbcc700 10 adding default.rgw.meta+root+testgso to cache LRU end 2020-04-08 00:00:00.645 7fc55bbcc700 10 updating xattr: name=ceph.objclass.version bl.length()=42 2020-04-08 00:00:00.645 7fc55bbcc700 20 get_system_obj_state: s->obj_tag was set empty 2020-04-08 00:00:00.645 7fc55bbcc700 20 Read xattr: user.rgw.idtag 2020-04-08 00:00:00.645 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+testgso : type miss (requested=0x13, cached=0x16) 2020-04-08 00:00:00.645 7fc55bbcc700 20 rados->read ofs=0 len=0 2020-04-08 00:00:00.646 7fc55bbcc700 20 rados_obj.operate() r=0 bl.length=181 2020-04-08 00:00:00.646 7fc55bbcc700 10 cache put: name=default.rgw.meta+root+testgso info.flags=0x13 2020-04-08 00:00:00.646 7fc55bbcc700 10 moving default.rgw.meta+root+testgso to cache LRU end 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=ceph.objclass.version bl.length()=42 2020-04-08 00:00:00.646 7fc55bbcc700 20 rgw_get_bucket_info: bucket instance: testgso[0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50] 2020-04-08 00:00:00.646 7fc55bbcc700 20 reading from default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.646 7fc55bbcc700 20 get_system_obj_state: rctx=0x7fc55bbc9bf0 obj=default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 state=0x561a9d27d540 s->prefetch_data=0 2020-04-08 00:00:00.646 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : expiry miss 2020-04-08 00:00:00.646 7fc55bbcc700 10 cache put: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 info.flags=0x16 2020-04-08 00:00:00.646 7fc55bbcc700 10 adding default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 to cache LRU end 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=ceph.objclass.version bl.length()=42 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=user.rgw.acl bl.length()=153 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=user.rgw.lc bl.length()=130 2020-04-08 00:00:00.646 7fc55bbcc700 20 get_system_obj_state: s->obj_tag was set empty 2020-04-08 00:00:00.646 7fc55bbcc700 20 Read xattr: user.rgw.acl 2020-04-08 00:00:00.646 7fc55bbcc700 20 Read xattr: user.rgw.idtag 2020-04-08 00:00:00.646 7fc55bbcc700 20 Read xattr: user.rgw.lc 2020-04-08 00:00:00.646 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : type miss (requested=0x13, cached=0x16) 2020-04-08 00:00:00.646 7fc55bbcc700 20 rados->read ofs=0 len=0 2020-04-08 00:00:00.646 7fc55bbcc700 20 rados_obj.operate() r=0 bl.length=281 2020-04-08 00:00:00.646 7fc55bbcc700 10 cache put: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 info.flags=0x13 2020-04-08 00:00:00.646 7fc55bbcc700 10 moving default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 to cache LRU end 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=ceph.objclass.version bl.length()=42 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=user.rgw.acl bl.length()=153 2020-04-08 00:00:00.646 7fc55bbcc700 10 updating xattr: name=user.rgw.lc bl.length()=130 2020-04-08 00:00:00.646 7fc55bbcc700 10 chain_cache_entry: cache_locator=default.rgw.meta+root+testgso 2020-04-08 00:00:00.646 7fc55bbcc700 10 chain_cache_entry: cache_locator=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.647 7fc55bbcc700 10 lifecycle: bucket_lc_process() prefix_map size=1 2020-04-08 00:00:00.647 7fc55bbcc700 20 lifecycle: bucket_lc_process(): prefix= 2020-04-08 00:00:00.647 7fc55bbcc700 10 cls_bucket_list_unordered testgso[0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50] start [] num_entries 1100 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file!test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file$test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file&test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file'test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file(test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file)test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file*test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file+test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file,test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file-test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file:test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file;test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file=test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file?test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got file(a)test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got lifecycle/file.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 10 RGWRados::cls_bucket_list_unordered: got test.txt[] 2020-04-08 00:00:00.647 7fc55bbcc700 20 lifecycle: bucket_lc_process(): key=file!test.txt 2020-04-08 00:00:00.648 7fc55bbcc700 20 obj_has_expired(): mtime=2020-03-31 13:26:07.0.481384s days=1 base_time=2020-04-08 00:00:00.000000 timediff=642833 cmp=86400 2020-04-08 00:00:00.648 7fc55bbcc700 20 check(): key=file!test.txt: is_expired=1 2020-04-08 00:00:00.648 7fc55bbcc700 20 get_obj_state: rctx=0x7fc55bbc8ff8 obj=testgso:file!test.txt state=0x561aa0c64aa0 s->prefetch_data=0 2020-04-08 00:00:00.657 7fc44f1b3700 10 manifest: total_size = 49667 2020-04-08 00:00:00.657 7fc44f1b3700 20 get_obj_state: setting s->obj_tag to 0a6e5079-ad65-4ea7-a868-b2133099e0d0.22351662.19462946 2020-04-08 00:00:00.672 7fc55bbcc700 20 get_obj_state: setting s->obj_tag to 0a6e5079-ad65-4ea7-a868-b2133099e0d0.25454121.7727201 2020-04-08 00:00:00.672 7fc55bbcc700 10 If-UnModified-Since: 2020-03-31 13:26:07.0.481384s Last-Modified: 2020-03-31 13:26:07.000000 2020-04-08 00:00:00.672 7fc55bbcc700 20 get_obj_state: rctx=0x7fc55bbc8ff8 obj=testgso:file!test.txt state=0x561aa0c64aa0 s->prefetch_data=0 2020-04-08 00:00:00.672 7fc55bbcc700 20 prepare_atomic_modification: state is not atomic. state=0x561aa0c64aa0 2020-04-08 00:00:00.672 7fc55bbcc700 20 reading from default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.672 7fc55bbcc700 20 get_system_obj_state: rctx=0x7fc55bbc8330 obj=default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 state=0x561a9d27d0c0 s->prefetch_data=0 2020-04-08 00:00:00.672 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : type miss (requested=0x16, cached=0x13) 2020-04-08 00:00:00.672 7fc55bbcc700 10 cache put: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 info.flags=0x16 2020-04-08 00:00:00.672 7fc55bbcc700 10 moving default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 to cache LRU end 2020-04-08 00:00:00.672 7fc55bbcc700 10 updating xattr: name=ceph.objclass.version bl.length()=42 2020-04-08 00:00:00.672 7fc55bbcc700 10 updating xattr: name=user.rgw.acl bl.length()=153 2020-04-08 00:00:00.672 7fc55bbcc700 10 updating xattr: name=user.rgw.lc bl.length()=130 2020-04-08 00:00:00.672 7fc55bbcc700 20 get_system_obj_state: s->obj_tag was set empty 2020-04-08 00:00:00.672 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : hit (requested=0x11, cached=0x17) 2020-04-08 00:00:00.672 7fc55bbcc700 20 bucket index object: .dir.0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.683 7fc55bbcc700 0 ERROR: remove_expired_obj 2020-04-08 00:00:00.683 7fc55bbcc700 0 ERROR: remove_expired_obj 2020-04-08 00:00:00.683 7fc55bbcc700 20 lifecycle: ERROR: orule.process() returned ret=-125 2020-04-08 00:00:00.683 7fc55bbcc700 20 lifecycle: bucket_lc_process(): key=file$test.txt 2020-04-08 00:00:00.683 7fc55bbcc700 20 obj_has_expired(): mtime=2020-03-31 13:29:57.0.916797s days=1 base_time=2020-04-08 00:00:00.000000 timediff=642603 cmp=86400 2020-04-08 00:00:00.683 7fc55bbcc700 20 check(): key=file$test.txt: is_expired=1 2020-04-08 00:00:00.683 7fc55bbcc700 20 get_obj_state: rctx=0x7fc55bbc8ff8 obj=testgso:file$test.txt state=0x561aa0c64aa0 s->prefetch_data=0 2020-04-08 00:00:00.684 7fc55bbcc700 10 manifest: total_size = 12 2020-04-08 00:00:00.684 7fc55bbcc700 20 get_obj_state: setting s->obj_tag to 0a6e5079-ad65-4ea7-a868-b2133099e0d0.25454121.7731015 2020-04-08 00:00:00.684 7fc55bbcc700 10 If-UnModified-Since: 2020-03-31 13:29:57.0.916797s Last-Modified: 2020-03-31 13:29:57.000000 2020-04-08 00:00:00.684 7fc55bbcc700 20 get_obj_state: rctx=0x7fc55bbc8ff8 obj=testgso:file$test.txt state=0x561aa0c64aa0 s->prefetch_data=0 2020-04-08 00:00:00.684 7fc55bbcc700 20 prepare_atomic_modification: state is not atomic. state=0x561aa0c64aa0 2020-04-08 00:00:00.684 7fc55bbcc700 20 reading from default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.684 7fc55bbcc700 20 get_system_obj_state: rctx=0x7fc55bbc8330 obj=default.rgw.meta:root:.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 state=0x561a9d27d0c0 s->prefetch_data=0 2020-04-08 00:00:00.684 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : hit (requested=0x16, cached=0x17) 2020-04-08 00:00:00.684 7fc55bbcc700 20 get_system_obj_state: s->obj_tag was set empty 2020-04-08 00:00:00.684 7fc55bbcc700 10 cache get: name=default.rgw.meta+root+.bucket.meta.testgso:0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 : hit (requested=0x11, cached=0x17) 2020-04-08 00:00:00.684 7fc55bbcc700 20 bucket index object: .dir.0a6e5079-ad65-4ea7-a868-b2133099e0d0.5495934.50 2020-04-08 00:00:00.711 7fc55bbcc700 0 ERROR: remove_expired_obj 2020-04-08 00:00:00.711 7fc55bbcc700 0 ERROR: remove_expired_obj 2020-04-08 00:00:00.711 7fc55bbcc700 20 lifecycle: ERROR: orule.process() returned ret=-125

4 years

1
0
0 0

[Octopus] OSD overloading

by Jack

Hello, I've a issue, since my Nautilus -> Octopus upgrade My cluster has many rbd images (~3k or something) Each of them has ~30 snapshots Each day, I create and remove a least a snapshot per image Since Octopus, when I remove the "nosnaptrim" flags, each OSDs uses 100% of its CPU time The whole cluster collapses: OSDs no longer see each others, most of them are seens as down .. I do not see any progress being made : it does not appear the problem will solve by itself What can I do ? Best regards,

4 years

8
12
0 0

Multiple OSDs down, and won't come up (possibly related to other Nautilus issues)

by aoanla＠gmail.com

So, this is following on from a discussion in the #ceph IRC channel, where we seem to have reached the limit of what we can do. I have a ~15 node, 311 OSD cluster. (20 OSDs per node). The cluster is Nautilus - the 3 MONs + the first 8 OSD hosts were installed as Mimic and upgraded to Nautilus with ceph-ansible ; the remaining OSD hosts were added directly with Nautilus as they were only added in a few weeks ago. Yesterday, suddenly, about half of the OSDs (~140) were marked Down, and a number of slow operations were detected. Initially, examining the logs (and with a bit of help from IRC), I noticed that the ansible roles used to build the newer OSDs had configured chrony incorrectly, and their clocks were drifting. (There were BADAUTHORIZER errors in OSD logs, too.) I fixed the chrony configuration... and we (including people in IRC) expected everything to just... stabilise. Things have not stabilised, which leads me to suspect that there are other issues at play. After noticing a number of issues with mgrs deadlocking in Nautilus - eg https://tracker.ceph.com/issues/17170 https://tracker.ceph.com/issues/43048 - I tried stopping all mgrs and mons, and then slowly bringing them up. This has not helped. Interestingly, the OSDs with slow ops (some of which are marked down) report ops_in_flight which are "wait for new map", whilst the lead mon believes those same ops are timed out. (I can of course, telnet to every OSD, even the down ones, from other OSDs, including ones which report issues talking to them on the same port; and from the lead mon.) I am wondering if this is an example of: https://tracker.ceph.com/issues/44184 as we did create a new pool shortly after adding the new OSD host nodes... but it isn't clear from that ticket [or the discussion on this list] how to fix this, other than removing the pool - which I can't do, as we need this pool to exist, and the pool is replaces needs to be decomissioned. Can anyone advise what I should do next? At present, obviously, the cluster is unusable.

4 years

3
5
0 0

Maximum CephFS Filesystem Size

by DHilsbos＠performair.com

All; We set up a CephFS on a Nautilus (14.2.8) cluster in February, to hold backups. We finally have all the backups running, and are just waiting for the system reach steady-state. I'm concerned about usage numbers, in the Dashboard Capacity it shows the cluster as 37% used, while under Filesystems --> <FSName> --> Pools -_> <data> --> Usage, it shows 71% used. Does CephFS place a limit on the size of a CephFS? Is there a limit to how large a pool can be in Ceph? Where is the sizing discrepancy coming from, and do I need to address it? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. DHilsbos(a)PerformAir.com www.PerformAir.com

4 years

2
2
0 0

Re: ceph mds can't recall client caps anymore

by Patrick Donnelly

On Tue, Apr 7, 2020 at 3:36 AM alean Huang <woalean(a)gmail.com> wrote: > > hi, > i am appreciate you work very much, there is a problem about ceph mds i meet. > ceph version: luminous > the mds cache grow to 20G and limit is 4G after 'stat *' from client which mount by kernel(centos7 , 4.14). > there are 1000w files in this dir. mds log show recall_client_state does not work after recall some caps. > > log is like this: > 2020-04-07 17:08:36.841304 7f23ddd15700 10 mds.2.server recall_client_state: session client.2366013 172.16.200.20:0/4162573294 caps 6801282, leases 0 > 2020-04-07 17:08:36.841323 7f23ddd15700 15 mds.2.server session recall threshold (16384) hit at 0; skipping! > 2020-04-07 17:08:36.841326 7f23ddd15700 7 mds.2.server recalled (throttled) 0 client caps. > > mds_recall_max_decay_rate = 2.5 It looks like your setting for mds_recall_max_caps is larger than mds_recall_max_decay_threshold. Are you changing these configurations? If so, why? -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

4 years

1
0
0 0

ceph-mgr Module "zabbix" cannot send Data

by i.schmidt＠langeoog.de

Hi Folks We are using Ceph as our storage backend on our 6 Node Proxmox VM Cluster. To Monitor our systems we use Zabbix and i would like to get some Ceph Data into our Zabbix to get some alarms when something goes wrong. Ceph mgr has a module, "zabbix" that uses "zabbix-sender" to actively send data, but i cannot get the module working. It always responds with "failed to send data" The network side seems to be fine: root@vm-2:~# traceroute 192.168.15.253 traceroute to 192.168.15.253 (192.168.15.253), 30 hops max, 60 byte packets 1 192.168.15.253 (192.168.15.253) 0.411 ms 0.402 ms 0.393 ms root@vm-2:~# nmap -p 10051 192.168.15.253 Starting Nmap 7.70 ( https://nmap.org ) at 2019-09-18 08:40 CEST Nmap scan report for 192.168.15.253 Host is up (0.00026s latency). PORT STATE SERVICE 10051/tcp open zabbix-trapper MAC Address: BA:F5:30:EF:40:EF (Unknown) Nmap done: 1 IP address (1 host up) scanned in 0.61 seconds root@vm-2:~# ceph zabbix config-show {"zabbix_port": 10051, "zabbix_host": "192.168.15.253", "identifier": "VM-2", "zabbix_sender": "/usr/bin/zabbix_sender", "interval": 60} root@vm-2:~# But if i try "ceph zabbix send" i get "failed to send data to zabbix" and this show up in the systems journal: Sep 18 08:41:13 vm-2 ceph-mgr[54445]: 2019-09-18 08:41:13.272 7fe360fe4700 -1 mgr.server reply reply (1) Operation not permitted The log of ceph-mgr on that machine states: 2019-09-18 08:42:18.188 7fe359fd6700 0 mgr[zabbix] Exception when sending: /usr/bin/zabbix_sender exited non-zero: zabbix_sender [3253392]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 44; total: 44; seconds spent: 0.000179"}] 2019-09-18 08:43:18.217 7fe359fd6700 0 mgr[zabbix] Exception when sending: /usr/bin/zabbix_sender exited non-zero: zabbix_sender [3253629]: DEBUG: answer [{"response":"success","info":"processed: 0; failed: 44; total: 44; seconds spent: 0.000321"}] I'm guessing, this could have something to do with user rights. But i have no idea where to start to track this down. Maybe someone here has a hint? If more information is needed, i will gladly provide it. greetings Ingo

4 years

5
6
0 0

Load on drives of different sizes in ceph

by Andras Pataki

Hi cephers, I'm looking for some advice on what to do about drives of different sizes in the same cluster. We have so far kept the drive sizes consistent on our main ceph cluster (using 8TB drives). We're getting some new hardware with larger, 12TB drives next, and I'm pondering on how best to configure them. If I just simply add them, they will have 1.5x the data (which is less of a problem), but will also get 1.5x the iops - so I presume it will slow the whole cluster down as a result (these drives will be busy, and the rest will not be as much). I'm wondering how people generally handle this. I'm more concerned about these larger drives being busier than the rest - so I'd like to be able to put for example 1/3 drive of less accessed data on them in addition to the usual data - to use the extra capacity but not increase the load on them. Is there an easy way to accomplish this? One possibility is to run two OSDs on the drive (in two crush hierarchies), which isn't ideal. Can I just run one OSD somehow and put it into two crush roots, or something similar? Andras

4 years

4
4
0 0

2024

2023

2022

2021

2020

2019

ceph-users April 2020