I have a query about https://docs.ceph.com/docs/master/cephfs/createfs/:
"The data pool used to create the file system is the "default" data pool and the location for storing all inode backtrace information, used for hard link management and disaster recovery. For this reason, all inodes created in CephFS have at least one object in the default data pool."
This does not match my experience (nautilus servers, nautilus FUSE client or CentOS 7 kernel client). I have a CephFS with a replicated top-level pool and a directory set to use erasure coding with setfattr, though I also ran the same test using the subvolume commands with the same result. "ceph df detail" shows no objects used in the top-level pool, as shown in https://gist.github.com/pcass-epcc/af24081cf014a66809e801f33bcb535b (also displayed in-line below).
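For reference, the EC directory layout was set with something along these lines (pool and path as in the output below):

    setfattr -n ceph.dir.layout.pool -v cephfs.fs1-ec.data /test-fs/ec/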
It would be useful if clients indeed didn't have to write to the top-level pool, since that would mean we could give different clients permission only to their pool-associated subdirectories without giving everyone write access to a pool whose data structures are shared between all users of the filesystem.
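The kind of per-client restriction I have in mind would be roughly the following (client name and MDS path are illustrative; the pool is the EC pool from above), i.e. OSD caps limited to the EC data pool only, with no access to the default data pool:

    ceph auth get-or-create client.ecuser \
        mon 'allow r' \
        mds 'allow rw path=/ec' \
        osd 'allow rw pool=cephfs.fs1-ec.data'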
[root@hdr-admon01 ec]# ceph df detail; ceph fs ls; ceph fs status
RAW STORAGE:
    CLASS    SIZE        AVAIL       USED       RAW USED    %RAW USED
    hdd      3.3 PiB     3.3 PiB     32 TiB     32 TiB      0.95
    nvme     2.9 TiB     2.9 TiB     504 MiB    2.5 GiB     0.08
    TOTAL    3.3 PiB     3.3 PiB     32 TiB     32 TiB      0.95

POOLS:
    POOL                          ID    STORED     OBJECTS    USED       %USED    MAX AVAIL    QUOTA OBJECTS    QUOTA BYTES    DIRTY    USED COMPR    UNDER COMPR
    cephfs.fs1.metadata           5     162 MiB    63         324 MiB    0.01     1.4 TiB      N/A              N/A            63       0 B           0 B
    cephfs.fs1-replicated.data    6     0 B        0          0 B        0        1.0 PiB      N/A              N/A            0        0 B           0 B
    cephfs.fs1-ec.data            7     8.0 GiB    2.05k      11 GiB     0        2.4 PiB      N/A              N/A            2.05k    0 B           0 B
name: fs1, metadata pool: cephfs.fs1.metadata, data pools: [cephfs.fs1-replicated.data cephfs.fs1-ec.data ]
fs1 - 4 clients
===
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | hdr-meta02 | Reqs: 0 /s | 29 | 16 |
+------+--------+------------+---------------+-------+-------+
+----------------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------------+----------+-------+-------+
| cephfs.fs1.metadata | metadata | 324M | 1414G |
| cephfs.fs1-replicated.data | data | 0 | 1063T |
| cephfs.fs1-ec.data | data | 11.4G | 2505T |
+----------------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| hdr-meta01 |
+-------------+
MDS version: ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
[root@hdr-admon01 ec]# ll /test-fs/ec/
total 12582912
-rw-r--r--. 1 root root 4294967296 Jan 27 22:26 new-file
-rw-r--r--. 2 root root 4294967296 Jan 28 14:06 new-file2
-rw-r--r--. 2 root root 4294967296 Jan 28 14:06 new-file-same-inode-as-newfile2
Regards,
Phil
_________________________________________
Philip Cass
HPC Systems Specialist - Senior Systems Administrator
EPCC
Advanced Computing Facility
Bush Estate
Penicuik
Tel: +44 (0)131 4457815
Email: p.cass@epcc.ed.ac.uk
_________________________________________
Hi,
I have a cephfs in production based on 2 pools (data+metadata).
Data is erasure-coded with the following profile:
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=3
m=2
plugin=jerasure
technique=reed_sol_van
w=8
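For reference, such a profile and data pool would typically be created with something along these lines (profile name, pool name and PG count are illustrative, not necessarily what we used):

    ceph osd erasure-code-profile set ec_32 k=3 m=2 plugin=jerasure \
        technique=reed_sol_van crush-failure-domain=host crush-root=default
    ceph osd pool create cephfs_data 256 256 erasure ec_32
    # overwrites must be enabled when an EC pool is used directly as a CephFS data pool
    ceph osd pool set cephfs_data allow_ec_overwrites true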
Metadata is in a replicated pool with size 3 (three copies).
The CRUSH rules are as follows:
[
    {
        "rule_id": 0,
        "rule_name": "replicated_rule",
        "ruleset": 0,
        "type": 1,
        "min_size": 1,
        "max_size": 10,
        "steps": [
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_firstn",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    },
    {
        "rule_id": 1,
        "rule_name": "ec_data",
        "ruleset": 1,
        "type": 3,
        "min_size": 3,
        "max_size": 5,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -1,
                "item_name": "default"
            },
            {
                "op": "chooseleaf_indep",
                "num": 0,
                "type": "host"
            },
            {
                "op": "emit"
            }
        ]
    }
]
When we installed it, everything was in the same room, but now we have split our cluster (6 servers, soon to be 8) across 2 rooms. Thus we updated the crushmap by adding a room layer (with ceph osd crush add-bucket room1 room etc.) and moved all our servers to the correct place in the tree (ceph osd crush move server1 room=room1 etc.).
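Concretely, the commands we ran were along these lines (bucket and server names are just examples):

    ceph osd crush add-bucket room1 room
    ceph osd crush add-bucket room2 room
    ceph osd crush move room1 root=default
    ceph osd crush move room2 root=default
    ceph osd crush move server1 room=room1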
Now we would like to change the rules to set the failure domain to room instead of host (to be sure that, in case of a disaster in one of the rooms, we will still have a copy in the other).
What is the best strategy to do this?
F.
Hello All,
I have a HW-RAID-based 240 TB data pool with about 200 million files for users in a scientific institution. Data sizes range from tiny parameter files for scientific calculations and experiments to huge images of brain scans. There are group directories, home directories, and Windows roaming profile directories, organized in ZFS pools on Solaris operating systems and exported via NFS and Samba to Linux, macOS, and Windows clients.
I would like to switch to CephFS because of its flexibility and expandability, but I cannot find any recommendations for which storage backend would be suitable for all the functionality we have.
Since I like ZFS features such as immediate snapshots of very large data pools, quotas for each file system within hierarchical data trees, and dynamic expandability by simply adding new disks or disk images without manual resizing, would it be a good idea to create RBD images, map them onto the file servers, and create zpools on the mapped images? I know that ZFS works best with raw disks, but maybe an RBD image is close enough to a raw disk?
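To make the idea concrete, I am thinking of something along these lines (pool, image and zpool names are just placeholders):

    rbd create rbdpool/zfs-image --size 10T    # create an RBD image
    rbd map rbdpool/zfs-image                  # maps to e.g. /dev/rbd0
    zpool create tank /dev/rbd0                # build a zpool on the mapped device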
Or would CephFS be the way to go? Can there be multiple CephFS pools, for example for the group data folders and for the users' home directory folders, or do I have to keep everything in one single file space?
Maybe someone can share his or her field experience?
Thank you very much.
Best regards
Willi
Hello,
in my cluster one OSD after another dies; I eventually recognized that it is simply an "abort" in the daemon, probably caused by:
2020-01-31 15:54:42.535930 7faf8f716700 -1 log_channel(cluster) log [ERR] : trim_object Snap 29c44 not in clones
Close to this message I get a stack trace:
ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
1: /usr/bin/ceph-osd() [0xb35f7d]
2: (()+0x11390) [0x7f0fec74b390]
3: (gsignal()+0x38) [0x7f0feab43428]
4: (abort()+0x16a) [0x7f0feab4502a]
5: (__gnu_cxx::__verbose_terminate_handler()+0x16d) [0x7f0feb48684d]
6: (()+0x8d6b6) [0x7f0feb4846b6]
7: (()+0x8d701) [0x7f0feb484701]
8: (()+0x8d919) [0x7f0feb484919]
9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x27e) [0xc3776e]
10: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x10dd) [0x868cfd]
11: (ReplicatedPG::repop_all_committed(ReplicatedPG::RepGather*)+0x80) [0x8690e0]
12: (Context::complete(int)+0x9) [0x6c8799]
13: (void ReplicatedBackend::sub_op_modify_reply<MOSDRepOpReply, 113>(std::tr1::shared_ptr<OpRequest>)+0x21b) [0xa5ae0b]
14: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x15b) [0xa53edb]
15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x1cb) [0x84c78b]
16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3ef) [0x6966ff]
17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x4e4) [0x696e14]
18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x71e) [0xc264fe]
19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xc29950]
20: (()+0x76ba) [0x7f0fec7416ba]
21: (clone()+0x6d) [0x7f0feac1541d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is
needed to interpret this.
Yes, I know it's still hammer; I want to upgrade soon, but I want to resolve that issue first. If I lose that PG, I don't mind.
So: what is the best approach? Can I use something like
ceph-objectstore-tool ... <object> remove-clone-metadata <cloneid> ? I assume 29c44 is my object, but what's the clone id?
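To be explicit, the invocation I have in mind would look roughly like this (OSD id, pgid and object name are placeholders, and I'm not sure this is the exact syntax on hammer):

    # run against the stopped OSD; all values below are placeholders
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --journal-path /var/lib/ceph/osd/ceph-<id>/journal \
        --pgid <pgid> '<object>' remove-clone-metadata <cloneid>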
Best regards,
derjohn