On 29/05/2023 20.55, Anthony D'Atri wrote:
> Check the uptime for the OSDs in question
I restarted all my OSDs within the past 10 days or so. Maybe OSD
restarts are somehow breaking these stats?
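For reference, this is roughly how I'm watching the counters (a sketch; the pg_map/stat_sum field names are from memory from the Quincy JSON output, so worth double-checking):

  ceph pg dump_json 2>/dev/null | jq -r '
    .pg_map.pg_stats[]
    | select(.state | test("backfill"))
    | [.pgid, .stat_sum.num_objects_misplaced, .stat_sum.num_objects_recovered]
    | @tsv'

Every backfilling PG shows 0 misplaced there, while num_objects_recovered keeps climbing.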
>
>> On May 29, 2023, at 6:44 AM, Hector Martin <marcan(a)marcan.st> wrote:
>>
>> Hi,
>>
>> I'm watching a cluster finish a bunch of backfilling, and I noticed that
>> quite often PGs end up with zero misplaced objects, even though they are
>> still backfilling.
>>
>> Right now the cluster is down to 6 backfilling PGs:
>>
>> data:
>>   volumes: 1/1 healthy
>>   pools:   6 pools, 268 pgs
>>   objects: 18.79M objects, 29 TiB
>>   usage:   49 TiB used, 25 TiB / 75 TiB avail
>>   pgs:     262 active+clean
>>            6   active+remapped+backfilling
>>
>> But there are no misplaced objects, and the misplaced column in `ceph pg
>> dump` is zero for all PGs.
>>
>> If I do a `ceph pg dump_json`, I can see `num_objects_recovered`
>> increasing for these PGs... but the misplaced count is still 0.
>>
>> Is there something else that would cause recoveries/backfills other than
>> misplaced objects? Or perhaps there is a bug somewhere causing the
>> misplaced object count to be misreported as 0 sometimes?
>>
>> # ceph -v
>> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
>> (stable)
>>
>> - Hector
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
>
- Hector
Hi,
as the documentation sends mixed signals in
https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv…
"Note
Binding to IPv4 is enabled by default, so if you just add the option to
bind to IPv6 you’ll actually put yourself into dual stack mode."
and
https://docs.ceph.com/en/latest/rados/configuration/msgr2/#address-formats
"Note
The ability to bind to multiple ports has paved the way for dual-stack
IPv4 and IPv6 support. That said, dual-stack operation is not yet
supported as of Quincy v17.2.0."
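For context, the dual-stack mode the first note talks about would, as I understand it, be enabled with both bind options set (a sketch I have not verified myself):

  [global]
      ms_bind_ipv4 = true
      ms_bind_ipv6 = true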
just a few quick questions:
Is dual-stack networking with IPv4 and IPv6 now supported or not?
From which version on is it considered stable?
Are OSDs now able to register themselves with two IP addresses in the
cluster map? MONs too?
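For the last question, I would expect a dual-stack OSD to show two entries in its addrvec, e.g. visible with something like the following (assuming the JSON field name I remember is still current):

  ceph osd dump --format json | jq '.osds[].public_addrs'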
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
Hi,
I'm running Ceph Quincy (17.2.6) with a RADOS Gateway. I have multiple tenants,
for example:
- Tenant1$manager
- Tenant1$readwrite
I would like to set a policy on a bucket (backups for example) owned by
*Tenant1$manager* to allow *Tenant1$readwrite* access to that bucket. I
can't find any documentation that discusses this scenario.
Does anyone know how to specify the Principal and Resource sections of a
policy.json file? Or any other configuration that I might be missing?
I've tried some variations on Principal and Resource, including and
excluding tenant information, but no luck yet.
For example:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/Tenant1$readwrite"]},
    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
    "Resource": [
      "arn:aws:s3:::Tenant1/backups"
    ]
  }]
}
I'm using s3cmd for testing, so:
s3cmd --config s3cfg.manager setpolicy policy.json s3://backups/
Returns:
s3://backups/: Policy updated
And then testing:
s3cmd --config s3cfg.readwrite ls s3://backups/
ERROR: Access to bucket 'backups' was denied
ERROR: S3 error: 403 (AccessDenied)
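In case it helps with suggestions: the next variation I was going to try puts the tenant in the account field of the Principal ARN (which I believe is the form used in the tenant example of the Ceph bucket policy docs) and lists both the bucket and object ARNs, since s3:ListBucket is evaluated against the bucket while GetObject/PutObject are evaluated against objects. This is untested on my side:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::Tenant1:user/readwrite"]},
    "Action": ["s3:ListBucket", "s3:GetObject", "s3:PutObject"],
    "Resource": [
      "arn:aws:s3:::backups",
      "arn:aws:s3:::backups/*"
    ]
  }]
}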
Thanks,
Tom
Hi all,
We rebooted all the nodes in our 17.2.5 cluster after performing kernel updates, but 2 of the OSDs on different nodes are not coming back up. This is a production cluster using cephadm.
The error message from the OSD log is ceph-osd[87340]: ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-665: (2) No such file or directory
The error message from ceph-volume is 2023-08-23T16:12:43.452-0500 7f0cad968600 2 bluestore(/dev/mapper/ceph--febad5a5--ba44--41aa--a39e--b9897f757752-osd--block--87e548f4--b9b5--4ed8--aca8--de703a341a50) _read_bdev_label unable to decode label at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input
We tried restarting the daemons and rebooting the node again, but still see the same error.
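For reference, the label that ceph-volume fails to decode can also be dumped directly with ceph-bluestore-tool (a diagnostic sketch using the device path from the error above; on a cephadm cluster this would be run inside a cephadm shell):

  ceph-bluestore-tool show-label --dev /dev/mapper/ceph--febad5a5--ba44--41aa--a39e--b9897f757752-osd--block--87e548f4--b9b5--4ed8--aca8--de703a341a50

On a healthy OSD this prints the BlueStore label as JSON.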
Has anyone experienced this issue before? How do we fix this?
Thanks,
Alison
Hi all,
I just created a Ceph cluster to use CephFS. When I create the CephFS
pools and the filesystem, I get the filesystem error below.
# ceph osd pool create cephfs_data 128
pool 'cephfs_data' created
# ceph osd pool create cephfs_metadata 128
pool 'cephfs_metadata' created
# ceph fs new cephfs cephfs_metadata cephfs_data
new fs with metadata pool 6 and data pool 5
# ceph -s
  cluster:
    id:     1c27def45-f0f9-494d-sfke-eb4323432fd
    health: HEALTH_ERR
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds

  services:
    mon: 2 daemons, quorum ceph-mon01,ceph-mon02
    mgr: ceph-adm01(active)
    mds: cephfs-0/0/1 up
    osd: 12 osds: 12 up, 12 in

  data:
    pools:   2 pools, 256 pgs
    objects: 0 objects, 0 B
    usage:   12 GiB used, 588 GiB / 600 GiB avail
    pgs:     256 active+clean
But when I check max_mds for the filesystem, it says 1:
# ceph fs get cephfs | grep max_mds
max_mds 1
Can anyone let me know what I am missing here? Any input is much appreciated.
Regards,
Ram
Ceph-explorer..
I have some questions for those who’ve experienced this issue.
1. It seems like those reporting this issue are seeing it strictly after upgrading to Octopus. From what version did each of these sites upgrade to Octopus? From Nautilus? Mimic? Luminous?
2. Does anyone have any lifecycle rules on a bucket experiencing this issue? If so, please describe.
3. Is anyone making copies of the affected objects (to same or to a different bucket) prior to seeing the issue? And if they are making copies, does the destination bucket have lifecycle rules? And if they are making copies, are those copies ever being removed?
4. Is anyone experiencing this issue willing to run their RGWs with 'debug_ms=1'? That would allow us to see a request from an RGW to either remove a tail object or decrement its reference counter (and when its counter reaches 0 it will be deleted).
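For reference, one way to turn that on for all RGW daemons via the centralized config (a sketch; adjust the config section to match your deployment, and revert afterwards since debug_ms=1 is verbose):

  ceph config set client.rgw debug_ms 1
  # reproduce the issue, collect the RGW logs, then revert:
  ceph config rm client.rgw debug_ms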
Thanks,
Eric
> On Nov 12, 2020, at 4:54 PM, huxiaoyu(a)horebdata.cn wrote:
>
> Looks like this is a very dangerous bug for data safety. Hope the bug would be quickly identified and fixed.
>
> best regards,
>
> Samuel
>
>
>
> huxiaoyu(a)horebdata.cn
>
> From: Janek Bevendorff
> Date: 2020-11-12 18:17
> To: huxiaoyu(a)horebdata.cn; EDH - Manuel Rios; Rafael Lopez
> CC: Robin H. Johnson; ceph-users
> Subject: Re: [ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk
> I have never seen this on Luminous. I recently upgraded to Octopus and the issue started occurring only a few weeks later.
>
> On 12/11/2020 16:37, huxiaoyu(a)horebdata.cn wrote:
> Which Ceph versions are affected by this RGW bug/issue? Luminous, Mimic, Octopus, or the latest?
>
> any idea?
>
> samuel
>
>
>
> huxiaoyu(a)horebdata.cn
>
> From: EDH - Manuel Rios
> Date: 2020-11-12 14:27
> To: Janek Bevendorff; Rafael Lopez
> CC: Robin H. Johnson; ceph-users
> Subject: [ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk
> This same error caused us to wipe a full cluster of 300 TB... it is probably related to some RADOS index/database bug, not to S3.
>
> As Janek explained, this is a major issue, because the error happens silently and you can only detect it via S3, when you go to delete/purge an S3 bucket and it drops NoSuchKey. The error is not related to S3 logic.
>
> Hope this time the devs can take enough time to find and resolve the issue. The error happens with low EC profiles, and even with replica x3 in some cases.
>
> Regards
>
>
>
> -----Original Message-----
> From: Janek Bevendorff <janek.bevendorff(a)uni-weimar.de>
> Sent: Thursday, 12 November 2020 14:06
> To: Rafael Lopez <rafael.lopez(a)monash.edu>
> CC: Robin H. Johnson <robbat2(a)gentoo.org>; ceph-users <ceph-users(a)ceph.io>
> Subject: [ceph-users] Re: NoSuchKey on key that is visible in s3 list/radosgw bk
>
> Here is a bug report concerning (probably) this exact issue:
> https://tracker.ceph.com/issues/47866
>
> I left a comment describing the situation and my (limited) experiences with it.
>
>
> On 11/11/2020 10:04, Janek Bevendorff wrote:
>>
>> Yeah, that seems to be it. There are 239 objects prefixed
>> .8naRUHSG2zfgjqmwLnTPvvY1m6DZsgh in my dump. However, there are none
>> of the multiparts from the other file to be found and the head object
>> is 0 bytes.
>>
>> I checked another multipart object with an end pointer of 11.
>> Surprisingly, it had way more than 11 parts (39 to be precise) named
>> .1, .1_1 .1_2, .1_3, etc. Not sure how Ceph identifies those, but I
>> could find them in the dump at least.
>>
>> I have no idea why the objects disappeared. I ran a Spark job over all
>> buckets, read 1 byte of every object and recorded errors. Of the 78
>> buckets, two are missing objects. One bucket is missing one object,
>> the other 15. So, luckily, the incidence is still quite low, but the
>> problem seems to be expanding slowly.
>>
>>
>> On 10/11/2020 23:46, Rafael Lopez wrote:
>>> Hi Janek,
>>>
>>> What you said sounds right - an S3 single part obj won't have an S3
>>> multipart string as part of the prefix. S3 multipart string looks
>>> like "2~m5Y42lPMIeis5qgJAZJfuNnzOKd7lme".
>>>
>>> From memory, single part S3 objects that don't fit in a single rados
>>> object are assigned a random prefix that has nothing to do with
>>> the object name, and the rados tail/data objects (not the head
>>> object) have that prefix.
>>> As per your working example, the prefix for that would be
>>> '.8naRUHSG2zfgjqmwLnTPvvY1m6DZsgh'. So there would be (239) "shadow"
>>> objects with names containing that prefix, and if you add up the
>>> sizes it should be the size of your S3 object.
>>>
>>> You should look at working and non working examples of both single
>>> and multipart S3 objects, as they are probably all a bit different
>>> when you look in rados.
>>>
>>> I agree it is a serious issue, because once objects are no longer in
>>> rados, they cannot be recovered. If it was a case that there was a
>>> link broken or rados objects renamed, then we could work to
>>> recover...but as far as I can tell, it looks like stuff is just
>>> vanishing from rados. The only explanation I can think of is some
>>> (rgw or rados) background process is incorrectly doing something with
>>> these objects (eg. renaming/deleting). I had thought perhaps it was a
>>> bug with the rgw garbage collector..but that is pure speculation.
>>>
>>> Once you can articulate the problem, I'd recommend logging a bug
>>> tracker upstream.
>>>
>>>
>>> On Wed, 11 Nov 2020 at 06:33, Janek Bevendorff
>>> <janek.bevendorff(a)uni-weimar.de> wrote:
>>>
>>> Here's something else I noticed: when I stat objects that work
>>> via radosgw-admin, the stat info contains a "begin_iter" JSON
>>> object with RADOS key info like this
>>>
>>>
>>> "key": {
>>> "name":
>>> "29/items/WIDE-20110924034843-crawl420/WIDE-20110924065228-02544.warc.gz",
>>> "instance": "",
>>> "ns": ""
>>> }
>>>
>>>
>>> and then "end_iter" with key info like this:
>>>
>>>
>>> "key": {
>>> "name":
>>> ".8naRUHSG2zfgjqmwLnTPvvY1m6DZsgh_239",
>>> "instance": "",
>>> "ns": "shadow"
>>> }
>>>
>>> However, when I check the broken 0-byte object, the "begin_iter"
>>> and "end_iter" keys look like this:
>>>
>>>
>>> "key": {
>>> "name":
>>> "29/items/WIDE-20110903143858-crawl428/WIDE-20110903143858-01166.warc.gz.2~m5Y42lPMIeis5qgJAZJfuNnzOKd7lme.1",
>>> "instance": "",
>>> "ns": "multipart"
>>> }
>>>
>>> [...]
>>>
>>>
>>> "key": {
>>> "name":
>>> "29/items/WIDE-20110903143858-crawl428/WIDE-20110903143858-01166.warc.gz.2~m5Y42lPMIeis5qgJAZJfuNnzOKd7lme.19",
>>> "instance": "",
>>> "ns": "multipart"
>>> }
>>>
>>> So, it's the full name plus a suffix and the namespace is
>>> multipart, not shadow (or empty). This in itself may just be an
>>> artefact of whether the object was uploaded in one go or as a
>>> multipart object, but the second difference is that I cannot find
>>> any of the multipart objects in my pool's object name dump. I
>>> can, however, find the shadow RADOS object of the intact S3 object.
>>>
>>>
>>>
>>>
>>> --
>>> *Rafael Lopez*
>>> Devops Systems Engineer
>>> Monash University eResearch Centre
>>>
>>> T: +61 3 9905 9118
>>> E: rafael.lopez(a)monash.edu
>>>
> _______________________________________________
> ceph-users mailing list -- ceph-users(a)ceph.io
> To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hello Ceph Gurus!
I'm running Ceph Pacific version.
If I run
ceph orch host ls --label osds
it shows all hosts with the label osds, and
ceph orch host ls --host-pattern host1
shows just host1, so each flag works as expected on its own.
But when combining the two, the label flag seems to "take over":
ceph orch host ls --label osds --host-pattern host1
6 hosts in cluster who had label osds whose hostname matched host1
This shows all hosts with the label osds instead of only host1, so at first the flags seem to act like an OR instead of an AND.
ceph orch host ls --label osds --host-pattern foo
6 hosts in cluster who had label osds whose hostname matched foo
even though "foo" doesn't even exist. And:
ceph orch host ls --label bar --host-pattern host1
0 hosts in cluster who had label bar whose hostname matched host1
If the combination were an OR, this should have returned host1: there is no label bar, but host1 exists. So the host-pattern is simply disregarded whenever a label is given.
This started because the OSD deployment spec had both a label and a host_pattern.
The cluster was attempting to deploy OSDs on all the servers with the
given label instead of the one host we needed,
which caused it to go into a warning state.
If I ran
ceph orch ls --export --service_name host1
it also showed both the label and the host_pattern:
unmanaged: false
placement:
  host_pattern:
  label:
The issue persisted until I removed the label tag.
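For anyone hitting the same thing, here is a sketch of how the placement can be restricted to a single host without using a label (the service id and device selection are hypothetical):

service_type: osd
service_id: osd_host1
placement:
  host_pattern: 'host1'
spec:
  data_devices:
    all: true

applied with ceph orch apply -i osd_host1.yaml.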
Thanks.
Hello,
I have a question regarding the default data pool of a CephFS.
According to the docs it is recommended to use a fast, replicated SSD
pool as the default pool for CephFS. What are the space requirements
for storing the inode backtrace information?
Let's say I have an 85 TiB replicated SSD pool (hot data) and a 3 PiB EC
data pool (cold data).
Does it make sense to create a third pool as the default pool which only
holds the inode backtrace information (and if so, what would be a good
size), or is it OK to use the SSD pool as the default pool?
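For illustration, the layout I have in mind would look roughly like this (a sketch; the pool names cephfs_metadata, cephfs_default, cephfs_hot_ssd and cephfs_cold_ec are placeholders):

  ceph fs new cephfs cephfs_metadata cephfs_default
  ceph fs add_data_pool cephfs cephfs_hot_ssd
  ceph fs add_data_pool cephfs cephfs_cold_ec
  # then direct data to the right pool via file layouts, e.g.
  setfattr -n ceph.dir.layout.pool -v cephfs_cold_ec /mnt/cephfs/cold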
Thanks
Dietmar
https://docs.ceph.com/en/reef/cephfs/createfs/ says:
> The data pool used to create the file system is the “default” data pool and the location for storing all inode backtrace information, which is used for hard link management and disaster recovery.
> For this reason, all CephFS inodes have at least one object in the default data pool. If erasure-coded pools are planned for file system data, it is best to configure the default as a replicated pool to improve small-object write and read performance when updating backtraces.
This poses the question:
Are normal replicated CephFS installations (metadata on SSDs, data on HDDs) set up with suboptimal performance because they don't do this?
If having inodes/backtraces on replicated instead of EC improves performance, shouldn't one expect that putting inodes/backtraces on SSD would improve it even more?
From the docs I also cannot really tell when inodes/backtraces become important.
Is that all the time, or only sometimes?
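For context, as far as I understand it the backtrace is stored as a "parent" xattr on the first RADOS object of each file in the default data pool, so it can be inspected with something like this (a sketch; the inode number 10000000000 is a placeholder, and I'm assuming the ceph-dencoder type name):

  rados -p cephfs_data getxattr 10000000000.00000000 parent > parent.bin
  ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json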
Thanks!