March 2021 - ceph-users - lists.ceph.io

by David Galloway

This is the 19th update to the Ceph Nautilus release series. This is a hotfix release to prevent daemons from binding to loopback network interfaces. All nautilus users are advised to upgrade to this release. Notable Changes --------------- * This release fixes a regression introduced in v14.2.18 whereby in certain environments, OSDs will bind to 127.0.0.1. See https://tracker.ceph.com/issues/49938. Getting Ceph ------------ * Git at git://github.com/ceph/ceph.git * Tarball at http://download.ceph.com/tarballs/ceph-14.2.19.tar.gz * For packages, see http://docs.ceph.com/docs/master/install/get-packages/ * Release git sha1: bb796b9b5bab9463106022eef406373182465d11

3 years

3
5
0 0

Running ceph on multiple networks

by Andrei Mikhailovsky

Hello everyone, I have a small ceph cluster consisting of 4 Ubuntu 20.04 osd servers mainly serving rbd images to Cloudstack kvm cluster. The ceph version is 15.2.9. The network is done in such a way that all storage cluster is ran over infiniband qdr links (ipoib). We've got the management network for our ceph servers and kvm over ethernet (192.168.1.1/24) and the ipoib storage network 192.168.2.1/24. We are in the process of updating our cluster with new hardware and planning to scrap the infiniband connectivity altogether and replace it with 10gbit ethernet. We are also going to replace the kvm host servers too. We were hoping to have minimal or preferably no downtime in this process. I was wondering if we could run the ceph services (mon, osd, radosgw) concurrently over two networks after we've added the 10G ethernet? While the upgrades and migration taking place, we need to have the ceph running over the current ipoib 192.168.2.1/24 as well as the 10G 192.168.3.1/24. Could you please help me with this? Cheers Andrei

3 years

1
0
0 0

How's the maturity of CephFS and how's the maturity of Ceph erasure code?

by fanyuanli

Hi all, I'm a rookie in CEPH. I want to ask two questions. One is the maturity of cephfs, the file system of CEPH, and whether it is recommended for production environment. The other is the maturity of CEPH's erasure code and whether it can be used in production environment. Are the above two questions explained in the official documents? I may not see where they are. Thank you! Fred Fan

3 years

4
4
0 0

15.2.10 Dashboard incompatible with Reverse Proxy?

by Christoph Brüning

Hello everyone, we have updated one of our clusters from 15.2.9 to 15.2.10 and cannot access the dashboard any more. The dashboard is behind an nginx reverse proxy, which proxy_passes https://some.admin.hostname/ceph (note the /ceph path!) to the actual mgr daemon in the (not so) public network of the ceph cluster. In 15.2.9, there would be <base href="https://some.admin.hostname/ceph"> set via JS. This has apparently changed in 15.2.10 -- see PR https://github.com/ceph/ceph/pull/39372 -- leaving a plain <base href="/"> in the HTML header. Browsers (tested with Firefox and Chromium on Debian Buster) now try to load further JS for dashboard functionality from https://some.admin.hostname/ *without* the /ceph path, which obviously fails. Is there any way to configure the path in the <base href=""> tag, e.g. some "ceph dashboard" cli option? The ceph cluster runs non-containerized on Ubuntu 20.04, using the packages from download.ceph.com, and was deployed using ceph-ansible. Best, Christoph -- Dr. Christoph Brüning Universität Würzburg HPC & DataManagement @ ct.qmat & RZUW Am Hubland D-97074 Würzburg Tel.: +49 931 31-80499

3 years

1
0
0 0

Preferred order of operations when changing crush map and pool rules

by Thomas Hukkelberg

Hi all! We run a 1.5PB cluster with 12 hosts, 192 OSDs (mix of NVMe and HDD) and need to improve our failure domain by altering the crush rules and moving rack to pods, which would imply a lot of data movement. I wonder what would the preferred order of operations be when doing such changes to the crush map and pools? Will there be minimal data movement by moving all racks to pods at once and change pool repl rules or is the best approach to first move racks one by one to pods and then change pool replication rules from rack to pods? Anyhow I guess it's good practice to set 'norebalance' before moving hosts and unset to start the actual moving? Right now we have the following setup: root -> rack2 -> ups1 + node51 + node57 + switch21 root -> rack3 -> ups2 + node52 + node58 + switch22 root -> rack4 -> ups3 + node53 + node59 + switch23 root -> rack5 -> ups4 + node54 + node60 -- switch 21 ^^ root -> rack6 -> ups5 + node55 + node61 -- switch 22 ^^ root -> rack7 -> ups6 + node56 + node62 -- switch 23 ^^ Note that racks 5-7 are connected to same ToR switches as racks 2-4. Cluster and frontend network are in different VXLANs connected with dual 40GbE. Failure domain for 3x replicated pools are currently by rack, and after adding hosts 57-62 we realized that if one of the switches reboots or fails, replicated PGs located only on those 4 hosts will be unavailable and force pools offline. I guess the best way would instead like to organize the racks in pods like this: root -> pod1 -> rack2 -> ups1 + node51 + node57 root -> pod1 -> rack5 -> ups4 + node54 + node60 -> switch21 root -> pod2 -> rack3 -> ups2 + node52 + node58 root -> pod2 -> rack6 -> ups5 + node55 + node61 -> switch 22 root -> pod3 -> rack4 -> ups3 + node53 + node59 root -> pod3 -> rack7 -> ups6 + node56 + node62 -> switch 23 The reason for this arrangement is that we in the future plan to organize the pods in different buildings. We're running nautilus 14.2.16 and are about to upgrade to Octopus. Should we upgrade to Octopus before crush changes? Any thoughts or insight on how to achieve this with minimal data movement and risk of cluster downtime would be welcome! --thomas -- Thomas Hukkelberg thomas(a)hovedkvarteret.no

3 years

3
2
0 0

Fwd: ceph-fuse false passed X_OK check

by Alex Taylor

Hi Patrick, Any updates? Looking forward to your reply :D On Thu, Dec 17, 2020 at 11:39 AM Patrick Donnelly <pdonnell(a)redhat.com> wrote: > > On Wed, Dec 16, 2020 at 5:46 PM Alex Taylor <alexu4993(a)gmail.com> wrote: > > > > Hi Cephers, > > > > I'm using VSCode remote development with a docker server. It worked OK > > but fails to start the debugger after /root mounted by ceph-fuse. The > > log shows that the binary passes access X_OK check but cannot be > > actually executed. see: > > > > ``` > > strace_log: access("/root/.vscode-server/extensions/ms-vscode.cpptools-1.1.3/debugAdapters/OpenDebugAD7", > > X_OK) = 0 > > > > root@develop:~# ls -alh > > .vscode-server/extensions/ms-vscode.cpptools-1.1.3/debugAdapters/OpenDebugAD7 > > -rw-r--r-- 1 root root 978 Dec 10 13:06 > > .vscode-server/extensions/ms-vscode.cpptools-1.1.3/debugAdapters/OpenDebugAD7 > > ``` > > > > I also test the access syscall on ext4, xfs and even cephfs kernel > > client, all of them return -EACCES, which is expected (the extension > > will then explicitly call chmod +x). > > > > After some digging in the code, I found it is probably caused by > > https://github.com/ceph/ceph/blob/master/src/client/Client.cc#L5549-L5550. > > So here come two questions: > > 1. Is this a bug or is there any concern I missed? > > I tried reproducing it with the master branch and could not. It might > be due to an older fuse/ceph. I suggest you upgrade! > I tried the master(332a188d9b3c4eb5c5ad2720b7299913c5a772ee) as well and the issue still exists. My test program is: ``` #include <stdio.h> #include <unistd.h> int main() { int r; const char path[] = "test"; r = access(path, F_OK); printf("file exists: %d\n", r); r = access(path, X_OK); printf("file executable: %d\n", r); return 0; } ``` And the test result: ``` # local filesystem: ext4 root@f626800a6e85:~# ls -l test -rw-r--r-- 1 root root 6 Dec 19 06:13 test root@f626800a6e85:~# ./a.out file exists: 0 file executable: -1 root@f626800a6e85:~# findmnt -t fuse.ceph-fuse TARGET SOURCE FSTYPE OPTIONS /root/mnt ceph-fuse fuse.ceph-fuse rw,nosuid,nodev,relatime,user_id=0,group_id=0,allow_other root@f626800a6e85:~# cd mnt # ceph-fuse root@f626800a6e85:~/mnt# ls -l test -rw-r--r-- 1 root root 6 Dec 19 06:10 test root@f626800a6e85:~/mnt# ./a.out file exists: 0 file executable: 0 root@f626800a6e85:~/mnt# ./test bash: ./test: Permission denied ``` Again, ceph-fuse says file `test` is executable but in fact it can't be executed. The kernel version I'm testing on is: ``` root@f626800a6e85:~/mnt# uname -ar Linux f626800a6e85 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64 GNU/Linux ``` Please try the program above and make sure you're running it as root user, thank you. And if the reproduction still fails, please let me know the kernel version. > > 2. It works again with fuse_default_permissions=true, any drawbacks if > > this option is set? > > Correctness (ironically, for you) and performance. > > -- > Patrick Donnelly, Ph.D. > He / Him / His > Principal Software Engineer > Red Hat Sunnyvale, CA > GPG: 19F28A586F808C2402351B93C3301A3E258DD79D >

3 years

2
1
0 0

Rados gateway static website

by Marcel Kuiper

despite the examples that can be found on the internet I have troubles setting up a static website that serves from a S3 bucket If anyone could point me in the right direction that would be much appreciated Marcel I created an index.html in the bucket sky gm-rc3-jumphost01@ceph/s3cmd (master)$ ./s3cmd info s3://sky/index.html s3://sky/index.html (object): File size: 42046 Last mod: Tue, 30 Mar 2021 13:28:02 GMT MIME type: text/html Storage: STANDARD MD5 sum: 93acaccebb23a18da33ec4294d99ea1a SSE: none Policy: none CORS: none ACL: *anon*: READ ACL: Generic Sky Account: FULL_CONTROL And curl returns gm-rc3-jumphost01@tmp/skills$ curl https://sky.static.gm.core.local/index.html <html> <head><title>404 Not Found</title></head> <body> <h1>404 Not Found</h1> <ul> <li>Code: NoSuchWebsiteConfiguration</li> <li>BucketName: sky</li> <li>RequestId: tx0000000000000000000ba-00606327b8-cca124-rc3-gm</li> <li>HostId: cca124-rc3-gm-rc3</li> COnfig of de rados instance [client.radosgw.rc3-gm] debug_rgw = 20 ms_debug = 1 rgw_zonegroup = rc3 rgw_zone = rc3-gm rgw_enable_static_website = true rgw_enable_apis = s3website rgw expose bucket = true rgw_dns_name = gm-rc3-radosgw.gm.core.local rgw_dns_s3website_name = static.gm.core.local rgw_resolve_cname = true host = gm-rc3-s3web01 keyring = /etc/ceph/ceph.client.radosgw.rc3-gm.keyring log_file = /var/log/ceph/radosgw.log user = ceph rgw_frontends = civetweb port=443s ssl_certificate=/etc/ceph/ssl/key_cert_ca.pem DNS (from pdnsutil list-zone) *.static.gm.core.local 3600 IN CNAME gm-rc3-s3web01.gm.core.local The logs shows 2021-03-30 15:32:53.725 7ff760fcd700 2 RGWDataChangesLog::ChangesRenewThread: start 2021-03-30 15:32:58.409 7ff746798700 20 HTTP_ACCEPT=*/* 2021-03-30 15:32:58.409 7ff746798700 20 HTTP_HOST=sky.static.gm.core.local 2021-03-30 15:32:58.409 7ff746798700 20 HTTP_USER_AGENT=curl/7.58.0 2021-03-30 15:32:58.409 7ff746798700 20 HTTP_VERSION=1.1 2021-03-30 15:32:58.409 7ff746798700 20 REMOTE_ADDR=10.128.160.47 2021-03-30 15:32:58.409 7ff746798700 20 REQUEST_METHOD=GET 2021-03-30 15:32:58.409 7ff746798700 20 REQUEST_URI=/index.html 2021-03-30 15:32:58.409 7ff746798700 20 SCRIPT_URI=/index.html 2021-03-30 15:32:58.409 7ff746798700 20 SERVER_PORT=443 2021-03-30 15:32:58.409 7ff746798700 20 SERVER_PORT_SECURE=443 2021-03-30 15:32:58.409 7ff746798700 1 ====== starting new request req=0x7ff746791740 ===== 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s initializing for trans_id = tx0000000000000000000c4-006063288a-cca124-rc3-gm 2021-03-30 15:32:58.409 7ff746798700 10 rgw api priority: s3=-1 s3website=1 2021-03-30 15:32:58.409 7ff746798700 10 host=sky.static.gm.core.local 2021-03-30 15:32:58.409 7ff746798700 20 subdomain=sky domain=static.gm.core.local in_hosted_domain=1 in_hosted_domain_s3website=1 2021-03-30 15:32:58.409 7ff746798700 20 final domain/bucket subdomain=sky domain=static.gm.core.local in_hosted_domain=1 in_hosted_domain_s3website=1 s->info.domain=static.gm.core.local s->info.request_uri=/sky/index.html 2021-03-30 15:32:58.409 7ff746798700 20 get_handler handler=29RGWHandler_REST_Obj_S3Website 2021-03-30 15:32:58.409 7ff746798700 10 handler=29RGWHandler_REST_Obj_S3Website 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s getting op 0 2021-03-30 15:32:58.409 7ff746798700 10 op=28RGWGetObj_ObjStore_S3Website 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj verifying requester 2021-03-30 15:32:58.409 7ff746798700 20 req 196 0.000s s3:get_obj rgw::auth::StrategyRegistry::s3_main_strategy_t: trying rgw::auth::s3::AWSAuthStrategy 2021-03-30 15:32:58.409 7ff746798700 20 req 196 0.000s s3:get_obj rgw::auth::s3::AWSAuthStrategy: trying rgw::auth::s3::S3AnonymousEngine 2021-03-30 15:32:58.409 7ff746798700 20 req 196 0.000s s3:get_obj rgw::auth::s3::S3AnonymousEngine granted access 2021-03-30 15:32:58.409 7ff746798700 20 req 196 0.000s s3:get_obj rgw::auth::s3::AWSAuthStrategy granted access 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj normalizing buckets and tenants 2021-03-30 15:32:58.409 7ff746798700 10 s->object=index.html s->bucket=sky 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj init permissions 2021-03-30 15:32:58.409 7ff746798700 15 decode_policy Read AccessControlPolicy<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Owner><ID>sky</ID><DisplayName>Generic Sky Account</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>sky</ID><DisplayName>Generic Sky Account</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy> 2021-03-30 15:32:58.409 7ff746798700 20 get_system_obj_state: rctx=0x7ff74678f310 obj=rc3-gm.rgw.meta:users.uid:anonymous state=0x55835be11220 s->prefetch_data=0 2021-03-30 15:32:58.409 7ff746798700 10 cache get: name=rc3-gm.rgw.meta+users.uid+anonymous : hit (negative entry) 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj recalculating target 2021-03-30 15:32:58.409 7ff746798700 10 retarget Starting retarget 2021-03-30 15:32:58.409 7ff746798700 10 RGWHandler_REST_S3Website::error_handler err_no=-2039 http_ret=404 2021-03-30 15:32:58.409 7ff746798700 20 No special error handling today! 2021-03-30 15:32:58.409 7ff746798700 20 op->ERRORHANDLER: err_no=-2039 new_err_no=-2039 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj op status=0 2021-03-30 15:32:58.409 7ff746798700 2 req 196 0.000s s3:get_obj http status=404 2021-03-30 15:32:58.409 7ff746798700 1 ====== req done req=0x7ff746791740 op status=0 http_status=404 latency=0s ====== 2021-03-30 15:32:58.409 7ff746798700 20 process_request() returned -2039 2021-03-30 15:32:58.409 7ff746798700 1 civetweb: 0x55835b24e000: 10.128.160.47 - - [30/Mar/2021:15:32:58 +0200] "GET /index.html HTTP/1.1" 404 511 - curl/7.58.0

3 years

2
2
0 0

Upgrade from Luminous to Nautilus now one MDS with could not get service secret

by Robert LeBlanc

We just upgraded our cluster from Lumious to Nautilus and after a few days one of our MDS servers is getting: 2021-03-28 18:06:32.304 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 16 2021-03-28 18:06:32.304 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s 2021-03-28 18:06:32.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 16 rtt 0.00400001 2021-03-28 18:06:36.308 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 17 2021-03-28 18:06:36.308 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s 2021-03-28 18:06:36.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 17 rtt 0 2021-03-28 18:06:37.788 7f57c900a700 0 auth: could not find secret_id=34586 2021-03-28 18:06:37.788 7f57c900a700 0 cephx: verify_authorizer could not get service secret for service mds secret_id=34586 2021-03-28 18:06:37.788 7f57c6004700 5 mds.sun-gcs01-mds02 ms_handle_reset on v2:10.65.101.13:46566/0 2021-03-28 18:06:40.308 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 18 2021-03-28 18:06:40.308 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s 2021-03-28 18:06:40.308 7f57c8809700 5 mds.beacon.sun-gcs01-mds02 received beacon reply up:standby seq 18 rtt 0 2021-03-28 18:06:44.304 7f57c37ff700 5 mds.beacon.sun-gcs01-mds02 Sending beacon up:standby seq 19 2021-03-28 18:06:44.304 7f57c37ff700 20 mds.beacon.sun-gcs01-mds02 sender thread waiting interval 4s I've tried removing the /var/lib/ceph/mds/ directory and getting the key again. I've removed the key and generated a new one, I've checked the clocks between all the nodes. From what I can tell, everything is good. We did have an issue where the monitor cluster fell over and would not boot. We reduced the monitors to a single monitor, disabled cephx, pulled it off the network and restarted the service a few times which allowed it to come up. We then expanded back to three mons and reenabled cephx and everything has been good until this. No other services seem to be suffering from this and it even appears that the MDS works okay even with these messages. We would like to figure out how to resolve this. Thank you, Robert LeBlanc ---------------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

3 years

2
1
0 0

Re: should I increase the amount of PGs?

by Boris Behrens

One week later the ceph is still balancing. What worries me like hell is the %USE on a lot of those OSDs. Does ceph resolv this on it's own? We are currently down to 5TB space in the cluster. Rebalancing single OSDs doesn't work well and it increases the "missplaced objects". I thought about letting upmap do some rebalancing. Anyone know if this is a good idea? Or if I should bite my nails an wait as I am the headache of my life. [root@s3db1 ~]# ceph osd getmap -o om; osdmaptool om --upmap out.txt --upmap-pool eu-central-1.rgw.buckets.data --upmap-max 10; cat out.txt got osdmap epoch 321975 osdmaptool: osdmap file 'om' writing upmap command output to: out.txt checking for upmap cleanups upmap, max-count 10, max deviation 5 limiting to pools eu-central-1.rgw.buckets.data ([11]) pools eu-central-1.rgw.buckets.data prepared 10/10 changes ceph osd rm-pg-upmap-items 11.209 ceph osd rm-pg-upmap-items 11.253 ceph osd pg-upmap-items 11.7f 79 88 ceph osd pg-upmap-items 11.fc 53 31 105 78 ceph osd pg-upmap-items 11.1d8 84 50 ceph osd pg-upmap-items 11.47f 94 86 ceph osd pg-upmap-items 11.49c 44 71 ceph osd pg-upmap-items 11.553 74 50 ceph osd pg-upmap-items 11.6c3 66 63 ceph osd pg-upmap-items 11.7ad 43 50 ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 795.42548 - 795 TiB 626 TiB 587 TiB 82 GiB 1.4 TiB 170 TiB 78.64 1.00 - root default 56 hdd 7.32619 1.00000 7.3 TiB 6.4 TiB 6.4 TiB 684 MiB 16 GiB 910 GiB 87.87 1.12 129 up osd.56 67 hdd 7.27739 1.00000 7.3 TiB 6.4 TiB 6.4 TiB 582 MiB 16 GiB 865 GiB 88.40 1.12 115 up osd.67 79 hdd 3.63689 1.00000 3.6 TiB 3.2 TiB 432 GiB 1.9 GiB 0 B 432 GiB 88.40 1.12 63 up osd.79 53 hdd 7.32619 1.00000 7.3 TiB 6.5 TiB 6.4 TiB 971 MiB 22 GiB 864 GiB 88.48 1.13 114 up osd.53 51 hdd 7.27739 1.00000 7.3 TiB 6.5 TiB 6.4 TiB 734 MiB 15 GiB 837 GiB 88.77 1.13 120 up osd.51 73 hdd 14.55269 1.00000 15 TiB 13 TiB 13 TiB 1.8 GiB 39 GiB 1.6 TiB 88.97 1.13 246 up osd.73 55 hdd 7.32619 1.00000 7.3 TiB 6.5 TiB 6.5 TiB 259 MiB 15 GiB 825 GiB 89.01 1.13 118 up osd.55 70 hdd 7.27739 1.00000 7.3 TiB 6.5 TiB 6.5 TiB 291 MiB 16 GiB 787 GiB 89.44 1.14 119 up osd.70 42 hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 685 MiB 8.2 GiB 374 GiB 90.23 1.15 60 up osd.42 94 hdd 3.63869 1.00000 3.6 TiB 3.3 TiB 3.3 TiB 132 MiB 7.7 GiB 345 GiB 90.75 1.15 64 up osd.94 25 hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 3.2 MiB 8.1 GiB 352 GiB 90.79 1.15 53 up osd.25 31 hdd 7.32619 1.00000 7.3 TiB 6.7 TiB 6.6 TiB 223 MiB 15 GiB 690 GiB 90.80 1.15 117 up osd.31 84 hdd 7.52150 1.00000 7.5 TiB 6.8 TiB 6.6 TiB 159 MiB 16 GiB 699 GiB 90.93 1.16 121 up osd.84 82 hdd 3.63689 1.00000 3.6 TiB 3.3 TiB 332 GiB 1.0 GiB 0 B 332 GiB 91.08 1.16 59 up osd.82 89 hdd 7.52150 1.00000 7.5 TiB 6.9 TiB 6.6 TiB 400 MiB 15 GiB 670 GiB 91.29 1.16 126 up osd.89 33 hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 382 MiB 8.6 GiB 327 GiB 91.46 1.16 66 up osd.33 90 hdd 7.52150 1.00000 7.5 TiB 6.9 TiB 6.6 TiB 338 MiB 15 GiB 658 GiB 91.46 1.16 112 up osd.90 105 hdd 3.63869 0.89999 3.6 TiB 3.3 TiB 3.3 TiB 206 MiB 8.1 GiB 301 GiB 91.91 1.17 56 up osd.105 66 hdd 7.27739 0.95000 7.3 TiB 6.7 TiB 6.7 TiB 322 MiB 16 GiB 548 GiB 92.64 1.18 121 up osd.66 46 hdd 7.27739 1.00000 7.3 TiB 6.8 TiB 6.7 TiB 316 MiB 16 GiB 536 GiB 92.81 1.18 119 up osd.46 Am Di., 23. März 2021 um 19:59 Uhr schrieb Boris Behrens <bb(a)kervyn.de>: > Good point. Thanks for the hint. I changed it for all OSDs from 5 to 1 > *crossing finger* > > Am Di., 23. März 2021 um 19:45 Uhr schrieb Dan van der Ster < > dan(a)vanderster.com>: > >> I see. When splitting PGs, the OSDs will increase is used space >> temporarily to make room for the new PGs. >> When going from 1024->2048 PGs, that means that half of the objects from >> each PG will be copied to a new PG, and then the previous PGs will have >> those objects deleted. >> >> Make sure osd_max_backfills is set to 1, so that not too many PGs are >> moving concurrently. >> >> >> >> On Tue, Mar 23, 2021, 7:39 PM Boris Behrens <bb(a)kervyn.de> wrote: >> >>> Thank you. >>> Currently I do not have any full OSDs (all <90%) but I keep this in mind. >>> What worries me is the ever increasing %USE metric (it went up from >>> around 72% to 75% in three hours). It looks like there is comming a lot of >>> data (there comes barely new data at the moment), but I think this might >>> have to do with my "let's try to increase the PGs to 2048". I hope that >>> ceph begins to split the old PGs into new ones and removes the old PGs. >>> >>> ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META >>> AVAIL %USE VAR PGS STATUS TYPE NAME >>> -1 795.42548 - 795 TiB 597 TiB 556 TiB 88 GiB 1.4 TiB >>> 198 TiB 75.12 1.00 - root default >>> >>> Am Di., 23. März 2021 um 19:21 Uhr schrieb Dan van der Ster < >>> dan(a)vanderster.com>: >>> >>>> While you're watching things, if an OSD is getting too close for >>>> comfort to the full ratio, you can temporarily increase it, e.g. >>>> ceph osd set-full-ratio 0.96 >>>> >>>> But don't set that too high -- you can really break an OSD if it gets >>>> 100% full (and then can't delete objects or whatever...) >>>> >>>> -- dan >>>> >>>> On Tue, Mar 23, 2021 at 7:17 PM Boris Behrens <bb(a)kervyn.de> wrote: >>>> > >>>> > Ok, then I will try to reweight the most filled OSDs to .95 and see >>>> if this helps. >>>> > >>>> > Am Di., 23. März 2021 um 19:13 Uhr schrieb Dan van der Ster < >>>> dan(a)vanderster.com>: >>>> >> >>>> >> Data goes to *all* PGs uniformly. >>>> >> Max_avail is limited by the available space on the most full OSD -- >>>> >> you should pay close attention to those and make sure they are moving >>>> >> in the right direction (decreasing!) >>>> >> >>>> >> Another point -- IMHO you should aim to get all PGs active+clean >>>> >> before you add yet another batch of new disks. While there are PGs >>>> >> backfilling, your osdmaps are accumulating on the mons and osds -- >>>> >> this itself will start to use a lot of space, and active+clean is the >>>> >> only way to trim the old maps. >>>> >> >>>> >> -- dan >>>> >> >>>> >> On Tue, Mar 23, 2021 at 7:05 PM Boris Behrens <bb(a)kervyn.de> wrote: >>>> >> > >>>> >> > So, >>>> >> > doing nothing and wait for the ceph to recover? >>>> >> > >>>> >> > In theory there should be enough disk space (more disks arriving >>>> tomorrow), but I fear that there might be an issue, when the backups get >>>> exported over night to this s3. Currently the max_avail lingers around 13TB >>>> and I hope, that the data will go to other PGs than the ones that are >>>> currently on filled OSDs. >>>> >> > >>>> >> > >>>> >> > >>>> >> > Am Di., 23. März 2021 um 18:58 Uhr schrieb Dan van der Ster < >>>> dan(a)vanderster.com>: >>>> >> >> >>>> >> >> Hi, >>>> >> >> >>>> >> >> backfill_toofull is not a bad thing when the cluster is really >>>> full >>>> >> >> like yours. You should expect some of the most full OSDs to >>>> eventually >>>> >> >> start decreasing in usage, as the PGs are moved to the new OSDs. >>>> Those >>>> >> >> backfill_toofull states should then resolve themselves as the OSD >>>> >> >> usage flattens out. >>>> >> >> Keep an eye on the usage of the backfill_full and nearfull OSDs >>>> though >>>> >> >> -- if they do eventually go above the full_ratio (95% by default), >>>> >> >> then writes to those OSDs would stop. >>>> >> >> >>>> >> >> But if on the other hand you're suffering from lots of slow ops or >>>> >> >> anything else visible to your users, then you could try to take >>>> some >>>> >> >> actions to slow down the rebalancing. Just let us know if that's >>>> the >>>> >> >> case and we can see about changing osd_max_backfills, some >>>> weights or >>>> >> >> maybe using the upmap-remapped tool. >>>> >> >> >>>> >> >> -- Dan >>>> >> >> >>>> >> >> On Tue, Mar 23, 2021 at 6:07 PM Boris Behrens <bb(a)kervyn.de> >>>> wrote: >>>> >> >> > >>>> >> >> > Ok, I should have listened to you :) >>>> >> >> > >>>> >> >> > In the last week we added more storage but the issue got worse >>>> instead. >>>> >> >> > Today I realized that the PGs were up to 90GB (bytes column in >>>> ceph pg ls said 95705749636), and the balance kept mentioning the 2048 PGs >>>> for this pool. We were at 72% utilization (ceph osd df tree, first line) >>>> for our cluster and I increased the PGs to 2048. >>>> >> >> > >>>> >> >> > Now I am in a world of trouble. >>>> >> >> > The space in the cluster went down, I am at 45% misplaced >>>> objects, and we already added 20x4TB disks just to not go down completly. >>>> >> >> > >>>> >> >> > The utilization is still going up and the overall free space in >>>> the cluster seems to go down. This is what my ceph status looks like and >>>> now I really need help to get that thing back to normal: >>>> >> >> > [root@s3db1 ~]# ceph status >>>> >> >> > cluster: >>>> >> >> > id: dca79fff-ffd0-58f4-1cff-82a2feea05f4 >>>> >> >> > health: HEALTH_WARN >>>> >> >> > 4 backfillfull osd(s) >>>> >> >> > 17 nearfull osd(s) >>>> >> >> > 37 pool(s) backfillfull >>>> >> >> > 13 large omap objects >>>> >> >> > Low space hindering backfill (add storage if this >>>> doesn't resolve itself): 570 pgs backfill_toofull >>>> >> >> > >>>> >> >> > services: >>>> >> >> > mon: 3 daemons, quorum >>>> ceph-s3-mon1,ceph-s3-mon2,ceph-s3-mon3 (age 44m) >>>> >> >> > mgr: ceph-mgr2(active, since 15m), standbys: ceph-mgr3, >>>> ceph-mgr1 >>>> >> >> > mds: 3 up:standby >>>> >> >> > osd: 110 osds: 110 up (since 28m), 110 in (since 28m); 1535 >>>> remapped pgs >>>> >> >> > rgw: 3 daemons active (eu-central-1, eu-msg-1, eu-secure-1) >>>> >> >> > >>>> >> >> > task status: >>>> >> >> > >>>> >> >> > data: >>>> >> >> > pools: 37 pools, 4032 pgs >>>> >> >> > objects: 116.23M objects, 182 TiB >>>> >> >> > usage: 589 TiB used, 206 TiB / 795 TiB avail >>>> >> >> > pgs: 160918554/348689415 objects misplaced (46.150%) >>>> >> >> > 2497 active+clean >>>> >> >> > 779 active+remapped+backfill_wait >>>> >> >> > 538 active+remapped+backfill_wait+backfill_toofull >>>> >> >> > 186 active+remapped+backfilling >>>> >> >> > 32 active+remapped+backfill_toofull >>>> >> >> > >>>> >> >> > io: >>>> >> >> > client: 27 MiB/s rd, 69 MiB/s wr, 497 op/s rd, 153 op/s wr >>>> >> >> > recovery: 1.5 GiB/s, 922 objects/s >>>> >> >> > >>>> >> >> > Am Di., 16. März 2021 um 09:34 Uhr schrieb Boris Behrens < >>>> bb(a)kervyn.de>: >>>> >> >> >> >>>> >> >> >> Hi Dan, >>>> >> >> >> >>>> >> >> >> my EC profile look very "default" to me. >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile ls >>>> >> >> >> default >>>> >> >> >> [root@s3db1 ~]# ceph osd erasure-code-profile get default >>>> >> >> >> k=2 >>>> >> >> >> m=1 >>>> >> >> >> plugin=jerasure >>>> >> >> >> technique=reed_sol_van >>>> >> >> >> >>>> >> >> >> I don't understand the ouput, but the balancing get worse over >>>> night: >>>> >> >> >> >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pool-pg-distribution >>>> 11 >>>> >> >> >> Searching for PGs in pools: ['11'] >>>> >> >> >> Summary: 1024 PGs on 84 osds >>>> >> >> >> >>>> >> >> >> Num OSDs with X PGs: >>>> >> >> >> 15: 8 >>>> >> >> >> 16: 7 >>>> >> >> >> 17: 6 >>>> >> >> >> 18: 10 >>>> >> >> >> 19: 1 >>>> >> >> >> 32: 10 >>>> >> >> >> 33: 4 >>>> >> >> >> 34: 6 >>>> >> >> >> 35: 8 >>>> >> >> >> 65: 5 >>>> >> >> >> 66: 5 >>>> >> >> >> 67: 4 >>>> >> >> >> 68: 10 >>>> >> >> >> [root@s3db1 ~]# ceph-scripts/tools/ceph-pg-histogram >>>> --normalize --pool=11 >>>> >> >> >> # NumSamples = 84; Min = 4.12; Max = 5.09 >>>> >> >> >> # Mean = 4.553355; Variance = 0.052415; SD = 0.228942; Median >>>> 4.561608 >>>> >> >> >> # each ∎ represents a count of 1 >>>> >> >> >> 4.1244 - 4.2205 [ 8]: ∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.2205 - 4.3166 [ 6]: ∎∎∎∎∎∎ >>>> >> >> >> 4.3166 - 4.4127 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.4127 - 4.5087 [ 10]: ∎∎∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.5087 - 4.6048 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.6048 - 4.7009 [ 19]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.7009 - 4.7970 [ 6]: ∎∎∎∎∎∎ >>>> >> >> >> 4.7970 - 4.8931 [ 8]: ∎∎∎∎∎∎∎∎ >>>> >> >> >> 4.8931 - 4.9892 [ 4]: ∎∎∎∎ >>>> >> >> >> 4.9892 - 5.0852 [ 1]: ∎ >>>> >> >> >> [root@s3db1 ~]# ceph osd df tree | sort -nk 17 | tail >>>> >> >> >> 14 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 724 GiB 19 GiB >>>> 0 B 724 GiB 80.56 1.07 56 up osd.14 >>>> >> >> >> 19 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 466 MiB >>>> 7.9 GiB 708 GiB 81.25 1.08 53 up osd.19 >>>> >> >> >> 4 hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 698 GiB 703 MiB >>>> 0 B 698 GiB 81.27 1.08 48 up osd.4 >>>> >> >> >> 24 hdd 3.63689 1.00000 3.6 TiB 3.0 TiB 695 GiB 640 MiB >>>> 0 B 695 GiB 81.34 1.08 46 up osd.24 >>>> >> >> >> 75 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 440 MiB >>>> 8.1 GiB 704 GiB 81.35 1.08 48 up osd.75 >>>> >> >> >> 71 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 3.0 TiB 7.5 MiB >>>> 8.0 GiB 663 GiB 82.44 1.09 47 up osd.71 >>>> >> >> >> 76 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 251 MiB >>>> 9.0 GiB 617 GiB 83.65 1.11 50 up osd.76 >>>> >> >> >> 33 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 399 MiB >>>> 8.1 GiB 618 GiB 83.85 1.11 55 up osd.33 >>>> >> >> >> 35 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 317 MiB >>>> 8.8 GiB 617 GiB 83.87 1.11 50 up osd.35 >>>> >> >> >> 34 hdd 3.73630 1.00000 3.7 TiB 3.2 TiB 3.1 TiB 451 MiB >>>> 8.7 GiB 545 GiB 85.75 1.14 54 up osd.34 >>>> >> >> >> >>>> >> >> >> Am Mo., 15. März 2021 um 17:23 Uhr schrieb Dan van der Ster < >>>> dan(a)vanderster.com>: >>>> >> >> >>> >>>> >> >> >>> Hi, >>>> >> >> >>> >>>> >> >> >>> How wide are your EC profiles? If they are really wide, you >>>> might be >>>> >> >> >>> reaching the limits of what is physically possible. Also, I'm >>>> not sure >>>> >> >> >>> that upmap in 14.2.11 is very smart about *improving* >>>> existing upmap >>>> >> >> >>> rules for a given PG, in the case that a PG already has an >>>> upmap-items >>>> >> >> >>> entry but it would help the distribution to add more mapping >>>> pairs to >>>> >> >> >>> that entry. What this means, is that it might sometimes be >>>> useful to >>>> >> >> >>> randomly remove some upmap entries and see if the balancer >>>> does a >>>> >> >> >>> better job when it replaces them. >>>> >> >> >>> >>>> >> >> >>> But before you do that, I re-remembered that looking at the >>>> total PG >>>> >> >> >>> numbers is not useful -- you need to check the PGs per OSD >>>> for the >>>> >> >> >>> eu-central-1.rgw.buckets.data pool only. >>>> >> >> >>> >>>> >> >> >>> We have a couple tools that can help with this: >>>> >> >> >>> >>>> >> >> >>> 1. To see the PGs per OSD for a given pool: >>>> >> >> >>> >>>> https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-pool-pg-dis… >>>> >> >> >>> >>>> >> >> >>> E.g.: ./ceph-pool-pg-distribution 11 # to see the >>>> distribution of >>>> >> >> >>> your eu-central-1.rgw.buckets.data pool. >>>> >> >> >>> >>>> >> >> >>> The output looks like this on my well balanced clusters: >>>> >> >> >>> >>>> >> >> >>> # ceph-scripts/tools/ceph-pool-pg-distribution 15 >>>> >> >> >>> Searching for PGs in pools: ['15'] >>>> >> >> >>> Summary: 256 pgs on 56 osds >>>> >> >> >>> >>>> >> >> >>> Num OSDs with X PGs: >>>> >> >> >>> 13: 16 >>>> >> >> >>> 14: 40 >>>> >> >> >>> >>>> >> >> >>> You should expect a trimodal for your cluster. >>>> >> >> >>> >>>> >> >> >>> 2. You can also use another script from that repo to see the >>>> PGs per >>>> >> >> >>> OSD normalized to crush weight: >>>> >> >> >>> ceph-scripts/tools/ceph-pg-histogram --normalize --pool=15 >>>> >> >> >>> >>>> >> >> >>> This might explain what is going wrong. >>>> >> >> >>> >>>> >> >> >>> Cheers, Dan >>>> >> >> >>> >>>> >> >> >>> >>>> >> >> >>> On Mon, Mar 15, 2021 at 3:04 PM Boris Behrens <bb(a)kervyn.de> >>>> wrote: >>>> >> >> >>> > >>>> >> >> >>> > Absolutly: >>>> >> >> >>> > [root@s3db1 ~]# ceph osd df tree >>>> >> >> >>> > ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP >>>> META AVAIL %USE VAR PGS STATUS TYPE NAME >>>> >> >> >>> > -1 673.54224 - 674 TiB 496 TiB 468 TiB 97 >>>> GiB 1.2 TiB 177 TiB 73.67 1.00 - root default >>>> >> >> >>> > -2 58.30331 - 58 TiB 42 TiB 38 TiB 9.2 >>>> GiB 99 GiB 16 TiB 72.88 0.99 - host s3db1 >>>> >> >> >>> > 23 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 714 >>>> MiB 25 GiB 3.7 TiB 74.87 1.02 194 up osd.23 >>>> >> >> >>> > 69 hdd 14.55269 1.00000 15 TiB 11 TiB 11 TiB 1.6 >>>> GiB 40 GiB 3.4 TiB 76.32 1.04 199 up osd.69 >>>> >> >> >>> > 73 hdd 14.55269 1.00000 15 TiB 11 TiB 11 TiB 1.3 >>>> GiB 34 GiB 3.8 TiB 74.15 1.01 203 up osd.73 >>>> >> >> >>> > 79 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 1.8 >>>> GiB 0 B 1.3 TiB 65.44 0.89 47 up osd.79 >>>> >> >> >>> > 80 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 2.2 >>>> GiB 0 B 1.3 TiB 65.34 0.89 48 up osd.80 >>>> >> >> >>> > 81 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.3 TiB 1.1 >>>> GiB 0 B 1.3 TiB 65.38 0.89 47 up osd.81 >>>> >> >> >>> > 82 hdd 3.63689 1.00000 3.6 TiB 2.5 TiB 1.1 TiB 619 >>>> MiB 0 B 1.1 TiB 68.46 0.93 41 up osd.82 >>>> >> >> >>> > -11 50.94173 - 51 TiB 37 TiB 37 TiB 3.5 >>>> GiB 98 GiB 14 TiB 71.90 0.98 - host s3db10 >>>> >> >> >>> > 63 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 647 >>>> MiB 14 GiB 2.0 TiB 72.72 0.99 94 up osd.63 >>>> >> >> >>> > 64 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.2 TiB 668 >>>> MiB 14 GiB 2.0 TiB 72.23 0.98 93 up osd.64 >>>> >> >> >>> > 65 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 227 >>>> MiB 14 GiB 2.1 TiB 71.16 0.97 100 up osd.65 >>>> >> >> >>> > 66 hdd 7.27739 1.00000 7.3 TiB 5.4 TiB 5.4 TiB 313 >>>> MiB 14 GiB 1.9 TiB 74.25 1.01 92 up osd.66 >>>> >> >> >>> > 67 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 584 >>>> MiB 14 GiB 2.1 TiB 70.63 0.96 96 up osd.67 >>>> >> >> >>> > 68 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 720 >>>> MiB 14 GiB 2.1 TiB 71.72 0.97 101 up osd.68 >>>> >> >> >>> > 70 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 425 >>>> MiB 14 GiB 2.1 TiB 70.59 0.96 97 up osd.70 >>>> >> >> >>> > -12 50.99052 - 51 TiB 38 TiB 37 TiB 2.1 >>>> GiB 97 GiB 13 TiB 73.77 1.00 - host s3db11 >>>> >> >> >>> > 46 hdd 7.27739 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 229 >>>> MiB 14 GiB 1.7 TiB 77.05 1.05 97 up osd.46 >>>> >> >> >>> > 47 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 159 >>>> MiB 13 GiB 2.2 TiB 70.00 0.95 89 up osd.47 >>>> >> >> >>> > 48 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 279 >>>> MiB 14 GiB 2.1 TiB 71.82 0.97 98 up osd.48 >>>> >> >> >>> > 49 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.4 TiB 276 >>>> MiB 14 GiB 1.8 TiB 74.90 1.02 95 up osd.49 >>>> >> >> >>> > 50 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 336 >>>> MiB 14 GiB 2.0 TiB 72.13 0.98 93 up osd.50 >>>> >> >> >>> > 51 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.6 TiB 728 >>>> MiB 15 GiB 1.6 TiB 77.76 1.06 98 up osd.51 >>>> >> >> >>> > 72 hdd 7.32619 1.00000 7.3 TiB 5.3 TiB 5.3 TiB 147 >>>> MiB 13 GiB 2.0 TiB 72.75 0.99 95 up osd.72 >>>> >> >> >>> > -37 58.55478 - 59 TiB 44 TiB 44 TiB 4.4 >>>> GiB 122 GiB 15 TiB 75.20 1.02 - host s3db12 >>>> >> >> >>> > 19 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.9 TiB 454 >>>> MiB 8.2 GiB 780 GiB 79.35 1.08 53 up osd.19 >>>> >> >> >>> > 71 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 2.9 TiB 7.1 >>>> MiB 8.0 GiB 734 GiB 80.56 1.09 47 up osd.71 >>>> >> >> >>> > 75 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.9 TiB 439 >>>> MiB 8.2 GiB 777 GiB 79.43 1.08 48 up osd.75 >>>> >> >> >>> > 76 hdd 3.68750 1.00000 3.7 TiB 3.0 TiB 3.0 TiB 241 >>>> MiB 8.9 GiB 688 GiB 81.77 1.11 50 up osd.76 >>>> >> >> >>> > 77 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 880 >>>> MiB 30 GiB 3.6 TiB 75.44 1.02 201 up osd.77 >>>> >> >> >>> > 78 hdd 14.60159 1.00000 15 TiB 10 TiB 10 TiB 1015 >>>> MiB 28 GiB 4.2 TiB 71.26 0.97 193 up osd.78 >>>> >> >> >>> > 83 hdd 14.60159 1.00000 15 TiB 11 TiB 11 TiB 1.4 >>>> GiB 30 GiB 3.8 TiB 73.76 1.00 203 up osd.83 >>>> >> >> >>> > -3 58.49872 - 58 TiB 42 TiB 36 TiB 8.2 >>>> GiB 89 GiB 17 TiB 71.71 0.97 - host s3db2 >>>> >> >> >>> > 1 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 3.2 >>>> GiB 37 GiB 3.7 TiB 74.58 1.01 196 up osd.1 >>>> >> >> >>> > 3 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 566 >>>> MiB 0 B 1.3 TiB 64.11 0.87 50 up osd.3 >>>> >> >> >>> > 4 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 771 GiB 695 >>>> MiB 0 B 771 GiB 79.30 1.08 48 up osd.4 >>>> >> >> >>> > 5 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.2 TiB 482 >>>> MiB 0 B 1.2 TiB 66.51 0.90 49 up osd.5 >>>> >> >> >>> > 6 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 1.8 >>>> GiB 0 B 1.3 TiB 64.00 0.87 42 up osd.6 >>>> >> >> >>> > 7 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 639 >>>> MiB 26 GiB 4.0 TiB 72.44 0.98 192 up osd.7 >>>> >> >> >>> > 74 hdd 14.65039 1.00000 15 TiB 10 TiB 10 TiB 907 >>>> MiB 26 GiB 4.2 TiB 71.32 0.97 193 up osd.74 >>>> >> >> >>> > -4 58.49872 - 58 TiB 43 TiB 36 TiB 34 >>>> GiB 85 GiB 16 TiB 72.69 0.99 - host s3db3 >>>> >> >> >>> > 2 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 980 >>>> MiB 26 GiB 3.8 TiB 74.36 1.01 203 up osd.2 >>>> >> >> >>> > 9 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 8.4 >>>> GiB 33 GiB 3.9 TiB 73.51 1.00 186 up osd.9 >>>> >> >> >>> > 10 hdd 14.65039 1.00000 15 TiB 10 TiB 10 TiB 650 >>>> MiB 26 GiB 4.2 TiB 71.64 0.97 201 up osd.10 >>>> >> >> >>> > 12 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 754 >>>> MiB 0 B 1.3 TiB 64.17 0.87 44 up osd.12 >>>> >> >> >>> > 13 hdd 3.63689 1.00000 3.6 TiB 2.8 TiB 813 GiB 2.4 >>>> GiB 0 B 813 GiB 78.17 1.06 58 up osd.13 >>>> >> >> >>> > 14 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 797 GiB 19 >>>> GiB 0 B 797 GiB 78.60 1.07 56 up osd.14 >>>> >> >> >>> > 15 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 TiB 2.2 >>>> GiB 0 B 1.3 TiB 63.96 0.87 41 up osd.15 >>>> >> >> >>> > -5 58.49872 - 58 TiB 43 TiB 36 TiB 6.7 >>>> GiB 97 GiB 15 TiB 74.04 1.01 - host s3db4 >>>> >> >> >>> > 11 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 940 >>>> MiB 26 GiB 4.0 TiB 72.49 0.98 196 up osd.11 >>>> >> >> >>> > 17 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1022 >>>> MiB 26 GiB 3.6 TiB 75.23 1.02 204 up osd.17 >>>> >> >> >>> > 18 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 945 >>>> MiB 45 GiB 3.8 TiB 74.16 1.01 193 up osd.18 >>>> >> >> >>> > 20 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1020 GiB 596 >>>> MiB 0 B 1020 GiB 72.62 0.99 57 up osd.20 >>>> >> >> >>> > 21 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1023 GiB 1.9 >>>> GiB 0 B 1023 GiB 72.54 0.98 41 up osd.21 >>>> >> >> >>> > 22 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1023 GiB 797 >>>> MiB 0 B 1023 GiB 72.54 0.98 53 up osd.22 >>>> >> >> >>> > 24 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 766 GiB 618 >>>> MiB 0 B 766 GiB 79.42 1.08 46 up osd.24 >>>> >> >> >>> > -6 58.89636 - 59 TiB 43 TiB 43 TiB 3.0 >>>> GiB 108 GiB 16 TiB 73.40 1.00 - host s3db5 >>>> >> >> >>> > 0 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 92 >>>> MiB 7.2 GiB 1.1 TiB 71.16 0.97 45 up osd.0 >>>> >> >> >>> > 25 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 2.4 >>>> MiB 7.3 GiB 1.1 TiB 71.23 0.97 41 up osd.25 >>>> >> >> >>> > 26 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 181 >>>> MiB 7.6 GiB 935 GiB 75.57 1.03 45 up osd.26 >>>> >> >> >>> > 27 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.6 TiB 5.1 >>>> MiB 7.0 GiB 1.1 TiB 71.20 0.97 47 up osd.27 >>>> >> >> >>> > 28 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 977 >>>> MiB 26 GiB 3.8 TiB 73.85 1.00 197 up osd.28 >>>> >> >> >>> > 29 hdd 14.65039 1.00000 15 TiB 11 TiB 10 TiB 872 >>>> MiB 26 GiB 4.1 TiB 71.98 0.98 196 up osd.29 >>>> >> >> >>> > 30 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 943 >>>> MiB 27 GiB 3.6 TiB 75.51 1.03 202 up osd.30 >>>> >> >> >>> > -7 58.89636 - 59 TiB 44 TiB 43 TiB 13 >>>> GiB 122 GiB 15 TiB 74.97 1.02 - host s3db6 >>>> >> >> >>> > 32 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 27 >>>> MiB 7.6 GiB 940 GiB 75.42 1.02 55 up osd.32 >>>> >> >> >>> > 33 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 376 >>>> MiB 8.2 GiB 691 GiB 81.94 1.11 55 up osd.33 >>>> >> >> >>> > 34 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 450 >>>> MiB 8.5 GiB 620 GiB 83.79 1.14 54 up osd.34 >>>> >> >> >>> > 35 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 TiB 316 >>>> MiB 8.4 GiB 690 GiB 81.98 1.11 50 up osd.35 >>>> >> >> >>> > 36 hdd 14.65039 1.00000 15 TiB 11 TiB 10 TiB 489 >>>> MiB 25 GiB 4.1 TiB 71.69 0.97 208 up osd.36 >>>> >> >> >>> > 37 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 11 >>>> GiB 38 GiB 4.0 TiB 72.41 0.98 195 up osd.37 >>>> >> >> >>> > 38 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.1 >>>> GiB 26 GiB 3.7 TiB 74.88 1.02 204 up osd.38 >>>> >> >> >>> > -8 58.89636 - 59 TiB 44 TiB 43 TiB 3.8 >>>> GiB 111 GiB 15 TiB 74.16 1.01 - host s3db7 >>>> >> >> >>> > 39 hdd 3.73630 1.00000 3.7 TiB 2.8 TiB 2.7 TiB 19 >>>> MiB 7.5 GiB 936 GiB 75.54 1.03 39 up osd.39 >>>> >> >> >>> > 40 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 TiB 144 >>>> MiB 7.1 GiB 1.1 TiB 69.87 0.95 39 up osd.40 >>>> >> >> >>> > 41 hdd 3.73630 1.00000 3.7 TiB 2.7 TiB 2.7 TiB 219 >>>> MiB 7.6 GiB 1011 GiB 73.57 1.00 55 up osd.41 >>>> >> >> >>> > 42 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 TiB 593 >>>> MiB 7.1 GiB 1.1 TiB 70.02 0.95 47 up osd.42 >>>> >> >> >>> > 43 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 500 >>>> MiB 27 GiB 3.7 TiB 74.67 1.01 204 up osd.43 >>>> >> >> >>> > 44 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.1 >>>> GiB 27 GiB 3.7 TiB 74.62 1.01 193 up osd.44 >>>> >> >> >>> > 45 hdd 14.65039 1.00000 15 TiB 11 TiB 11 TiB 1.2 >>>> GiB 29 GiB 3.6 TiB 75.16 1.02 204 up osd.45 >>>> >> >> >>> > -9 51.28331 - 51 TiB 39 TiB 39 TiB 4.9 >>>> GiB 107 GiB 12 TiB 76.50 1.04 - host s3db8 >>>> >> >> >>> > 8 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.5 TiB 474 >>>> MiB 14 GiB 1.7 TiB 76.37 1.04 98 up osd.8 >>>> >> >> >>> > 16 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.7 TiB 783 >>>> MiB 15 GiB 1.6 TiB 78.39 1.06 100 up osd.16 >>>> >> >> >>> > 31 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.6 TiB 441 >>>> MiB 14 GiB 1.6 TiB 77.70 1.05 91 up osd.31 >>>> >> >> >>> > 52 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.5 TiB 939 >>>> MiB 14 GiB 1.7 TiB 76.29 1.04 102 up osd.52 >>>> >> >> >>> > 53 hdd 7.32619 1.00000 7.3 TiB 5.4 TiB 5.4 TiB 848 >>>> MiB 18 GiB 1.9 TiB 74.30 1.01 98 up osd.53 >>>> >> >> >>> > 54 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 1.0 >>>> GiB 16 GiB 1.7 TiB 76.99 1.05 106 up osd.54 >>>> >> >> >>> > 55 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 460 >>>> MiB 15 GiB 1.8 TiB 75.46 1.02 105 up osd.55 >>>> >> >> >>> > -10 51.28331 - 51 TiB 37 TiB 37 TiB 3.8 >>>> GiB 96 GiB 14 TiB 72.77 0.99 - host s3db9 >>>> >> >> >>> > 56 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 846 >>>> MiB 13 GiB 2.1 TiB 71.16 0.97 104 up osd.56 >>>> >> >> >>> > 57 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 TiB 513 >>>> MiB 15 GiB 1.7 TiB 76.53 1.04 96 up osd.57 >>>> >> >> >>> > 58 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 TiB 604 >>>> MiB 13 GiB 2.1 TiB 71.23 0.97 98 up osd.58 >>>> >> >> >>> > 59 hdd 7.32619 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 414 >>>> MiB 13 GiB 2.2 TiB 70.03 0.95 88 up osd.59 >>>> >> >> >>> > 60 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 TiB 227 >>>> MiB 14 GiB 1.8 TiB 75.54 1.03 97 up osd.60 >>>> >> >> >>> > 61 hdd 7.32619 1.00000 7.3 TiB 5.1 TiB 5.1 TiB 456 >>>> MiB 13 GiB 2.2 TiB 70.01 0.95 95 up osd.61 >>>> >> >> >>> > 62 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.4 TiB 843 >>>> MiB 14 GiB 1.8 TiB 74.93 1.02 110 up osd.62 >>>> >> >> >>> > TOTAL 674 TiB 496 TiB 468 TiB 97 >>>> GiB 1.2 TiB 177 TiB 73.67 >>>> >> >> >>> > MIN/MAX VAR: 0.87/1.14 STDDEV: 4.22 >>>> >> >> >>> > >>>> >> >> >>> > Am Mo., 15. März 2021 um 15:02 Uhr schrieb Dan van der Ster >>>> <dan(a)vanderster.com>: >>>> >> >> >>> >> >>>> >> >> >>> >> OK thanks. Indeed "prepared 0/10 changes" means it thinks >>>> things are balanced. >>>> >> >> >>> >> Could you again share the full ceph osd df tree? >>>> >> >> >>> >> >>>> >> >> >>> >> On Mon, Mar 15, 2021 at 2:54 PM Boris Behrens < >>>> bb(a)kervyn.de> wrote: >>>> >> >> >>> >> > >>>> >> >> >>> >> > Hi Dan, >>>> >> >> >>> >> > >>>> >> >> >>> >> > I've set the autoscaler to warn, but it actually does >>>> not warn for now. So not touching it for now. >>>> >> >> >>> >> > >>>> >> >> >>> >> > this is what the log says in minute intervals: >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/active >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/sleep_interval >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/begin_time >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/end_time >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/begin_weekday >>>> >> >> >>> >> > 2021-03-15 13:51:00.970 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/end_weekday >>>> >> >> >>> >> > 2021-03-15 13:51:00.971 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/pool_ids >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] >>>> Optimize plan auto_2021-03-15_13:51:00 >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/mode >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] >>>> Mode upmap, max misplaced 0.050000 >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] >>>> do_upmap >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/upmap_max_iterations >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr get_config >>>> get_config key: mgr/balancer/upmap_max_deviation >>>> >> >> >>> >> > 2021-03-15 13:51:01.203 7f307d5fd700 4 mgr[balancer] >>>> pools ['eu-msg-1.rgw.data.root', 'eu-msg-1.rgw.buckets.non-ec', >>>> 'eu-central-1.rgw.users.keys', 'eu-central-1.rgw.gc', >>>> 'eu-central-1.rgw.buckets.data', 'eu-central-1.rgw.users.email', >>>> 'eu-msg-1.rgw.gc', 'eu-central-1.rgw.usage', 'eu-msg-1.rgw.users.keys', >>>> 'eu-central-1.rgw.buckets.index', 'rbd', 'eu-msg-1.rgw.log', >>>> 'whitespace-again-2021-03-10_2', 'eu-msg-1.rgw.buckets.index', >>>> 'eu-msg-1.rgw.meta', 'eu-central-1.rgw.log', 'default.rgw.gc', >>>> 'eu-central-1.rgw.buckets.non-ec', 'eu-msg-1.rgw.usage', >>>> 'whitespace-again-2021-03-10', 'fra-1.rgw.meta', >>>> 'eu-central-1.rgw.users.uid', 'eu-msg-1.rgw.users.email', >>>> 'fra-1.rgw.control', 'eu-msg-1.rgw.users.uid', 'eu-msg-1.rgw.control', >>>> '.rgw.root', 'eu-msg-1.rgw.buckets.data', 'default.rgw.control', >>>> 'fra-1.rgw.log', 'default.rgw.data.root', 'whitespace-again-2021-03-10_3', >>>> 'default.rgw.log', 'eu-central-1.rgw.meta', 'eu-central-1.rgw.data.root', >>>> 'default.rgw.users.uid', 'eu-central-1.rgw.control'] >>>> >> >> >>> >> > 2021-03-15 13:51:01.224 7f307d5fd700 4 mgr[balancer] >>>> prepared 0/10 changes >>>> >> >> >>> >> > >>>> >> >> >>> >> > Am Mo., 15. März 2021 um 14:15 Uhr schrieb Dan van der >>>> Ster <dan(a)vanderster.com>: >>>> >> >> >>> >> >> >>>> >> >> >>> >> >> I suggest to just disable the autoscaler until your >>>> balancing is understood. >>>> >> >> >>> >> >> >>>> >> >> >>> >> >> What does your active mgr log say (with debug_mgr 4/5), >>>> grep balancer >>>> >> >> >>> >> >> /var/log/ceph/ceph-mgr.*.log >>>> >> >> >>> >> >> >>>> >> >> >>> >> >> -- Dan >>>> >> >> >>> >> >> >>>> >> >> >>> >> >> On Mon, Mar 15, 2021 at 1:47 PM Boris Behrens < >>>> bb(a)kervyn.de> wrote: >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > Hi, >>>> >> >> >>> >> >> > this unfortunally did not solve my problem. I still >>>> have some OSDs that fill up to 85% >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > According to the logging, the autoscaler might want >>>> to add more PGs to one Bucken and reduce almost all other buckets to 32. >>>> >> >> >>> >> >> > 2021-03-15 12:19:58.825 7f307f601700 4 >>>> mgr[pg_autoscaler] Pool 'eu-central-1.rgw.buckets.data' root_id -1 using >>>> 0.705080476146 of space, bias 1.0, pg target 1974.22533321 quantized to >>>> 2048 (current 1024) >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > Why the balancing does not happen is still nebulous >>>> to me. >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > Am Sa., 13. März 2021 um 16:37 Uhr schrieb Dan van >>>> der Ster <dan(a)vanderster.com>: >>>> >> >> >>> >> >> >> >>>> >> >> >>> >> >> >> OK >>>> >> >> >>> >> >> >> Btw, you might need to fail to a new mgr... I'm not >>>> sure if the current active will read that new config. >>>> >> >> >>> >> >> >> >>>> >> >> >>> >> >> >> .. dan >>>> >> >> >>> >> >> >> >>>> >> >> >>> >> >> >> >>>> >> >> >>> >> >> >> On Sat, Mar 13, 2021, 4:36 PM Boris Behrens < >>>> bb(a)kervyn.de> wrote: >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> Hi, >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> ok thanks. I just changed the value and rewighted >>>> everything back to 1. Now I let it sync the weekend and check how it will >>>> be on monday. >>>> >> >> >>> >> >> >>> We tried to have the systems total storage balanced >>>> as possible. New systems will be with 8TB disks but for the exiting ones we >>>> added 16TB to offset the 4TB disks and we needed a lot of storage fast, >>>> because of a DC move. If you have any recommendations I would be happy to >>>> hear them. >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> Cheers >>>> >> >> >>> >> >> >>> Boris >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> Am Sa., 13. März 2021 um 16:20 Uhr schrieb Dan van >>>> der Ster <dan(a)vanderster.com>: >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> Thanks. >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> Decreasing the max deviation to 2 or 1 should help >>>> in your case. This option controls when the balancer stops trying to move >>>> PGs around -- by default it stops when the deviation from the mean is 5. >>>> Yes this is too large IMO -- all of our clusters have this set to 1. >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> And given that you have some OSDs with more than >>>> 200 PGs, you definitely shouldn't increase the num PGs. >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> But anyway with your mixed device sizes it might >>>> be challenging to make a perfectly uniform distribution. Give it a try with >>>> 1 though, and let us know how it goes. >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> .. Dan >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> >>>> >> >> >>> >> >> >>>> On Sat, Mar 13, 2021, 4:11 PM Boris Behrens < >>>> bb(a)kervyn.de> wrote: >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> Hi Dan, >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> upmap_max_deviation is default (5) in our >>>> cluster. Is 1 the recommended deviation? >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> I added the whole ceph osd df tree, (I need to >>>> remove some OSDs and readd them as bluestore with SSD, so 69, 73 and 82 are >>>> a bit off now. I also reweighted to try to get the %USE mitigated). >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> I will increase the mgr debugging to see what is >>>> the problem. >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> [root@s3db1 ~]# ceph osd df tree >>>> >> >> >>> >> >> >>>>> ID CLASS WEIGHT REWEIGHT SIZE RAW USE >>>> DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME >>>> >> >> >>> >> >> >>>>> -1 673.54224 - 659 TiB 491 TiB 464 >>>> TiB 96 GiB 1.2 TiB 168 TiB 74.57 1.00 - root default >>>> >> >> >>> >> >> >>>>> -2 58.30331 - 44 TiB 22 TiB 17 >>>> TiB 5.7 GiB 38 GiB 22 TiB 49.82 0.67 - host s3db1 >>>> >> >> >>> >> >> >>>>> 23 hdd 14.65039 1.00000 15 TiB 1.8 TiB 1.7 >>>> TiB 156 MiB 4.4 GiB 13 TiB 12.50 0.17 101 up osd.23 >>>> >> >> >>> >> >> >>>>> 69 hdd 14.55269 0 0 B 0 B >>>> 0 B 0 B 0 B 0 B 0 0 11 up osd.69 >>>> >> >> >>> >> >> >>>>> 73 hdd 14.55269 1.00000 15 TiB 10 TiB 10 >>>> TiB 6.1 MiB 33 GiB 4.2 TiB 71.15 0.95 107 up osd.73 >>>> >> >> >>> >> >> >>>>> 79 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 747 >>>> GiB 2.0 GiB 0 B 747 GiB 79.94 1.07 52 up osd.79 >>>> >> >> >>> >> >> >>>>> 80 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1.0 >>>> TiB 1.9 GiB 0 B 1.0 TiB 71.61 0.96 58 up osd.80 >>>> >> >> >>> >> >> >>>>> 81 hdd 3.63689 1.00000 3.6 TiB 2.2 TiB 1.5 >>>> TiB 1.1 GiB 0 B 1.5 TiB 60.07 0.81 55 up osd.81 >>>> >> >> >>> >> >> >>>>> 82 hdd 3.63689 1.00000 3.6 TiB 1.9 TiB 1.7 >>>> TiB 536 MiB 0 B 1.7 TiB 52.68 0.71 30 up osd.82 >>>> >> >> >>> >> >> >>>>> -11 50.94173 - 51 TiB 38 TiB 38 >>>> TiB 3.7 GiB 100 GiB 13 TiB 74.69 1.00 - host s3db10 >>>> >> >> >>> >> >> >>>>> 63 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.5 >>>> TiB 616 MiB 14 GiB 1.7 TiB 76.04 1.02 92 up osd.63 >>>> >> >> >>> >> >> >>>>> 64 hdd 7.27739 1.00000 7.3 TiB 5.5 TiB 5.5 >>>> TiB 820 MiB 15 GiB 1.8 TiB 75.54 1.01 101 up osd.64 >>>> >> >> >>> >> >> >>>>> 65 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 >>>> TiB 109 MiB 14 GiB 2.0 TiB 73.17 0.98 105 up osd.65 >>>> >> >> >>> >> >> >>>>> 66 hdd 7.27739 1.00000 7.3 TiB 5.8 TiB 5.8 >>>> TiB 423 MiB 15 GiB 1.4 TiB 80.38 1.08 98 up osd.66 >>>> >> >> >>> >> >> >>>>> 67 hdd 7.27739 1.00000 7.3 TiB 5.1 TiB 5.1 >>>> TiB 572 MiB 14 GiB 2.2 TiB 70.10 0.94 100 up osd.67 >>>> >> >> >>> >> >> >>>>> 68 hdd 7.27739 1.00000 7.3 TiB 5.3 TiB 5.3 >>>> TiB 630 MiB 13 GiB 2.0 TiB 72.88 0.98 107 up osd.68 >>>> >> >> >>> >> >> >>>>> 70 hdd 7.27739 1.00000 7.3 TiB 5.4 TiB 5.4 >>>> TiB 648 MiB 14 GiB 1.8 TiB 74.73 1.00 102 up osd.70 >>>> >> >> >>> >> >> >>>>> -12 50.99052 - 51 TiB 39 TiB 39 >>>> TiB 2.9 GiB 99 GiB 12 TiB 77.24 1.04 - host s3db11 >>>> >> >> >>> >> >> >>>>> 46 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.7 >>>> TiB 102 MiB 15 GiB 1.5 TiB 78.91 1.06 97 up osd.46 >>>> >> >> >>> >> >> >>>>> 47 hdd 7.27739 1.00000 7.3 TiB 5.2 TiB 5.2 >>>> TiB 61 MiB 13 GiB 2.1 TiB 71.47 0.96 96 up osd.47 >>>> >> >> >>> >> >> >>>>> 48 hdd 7.27739 1.00000 7.3 TiB 6.1 TiB 6.1 >>>> TiB 853 MiB 15 GiB 1.2 TiB 83.46 1.12 109 up osd.48 >>>> >> >> >>> >> >> >>>>> 49 hdd 7.27739 1.00000 7.3 TiB 5.7 TiB 5.7 >>>> TiB 708 MiB 15 GiB 1.5 TiB 78.96 1.06 98 up osd.49 >>>> >> >> >>> >> >> >>>>> 50 hdd 7.27739 1.00000 7.3 TiB 5.9 TiB 5.8 >>>> TiB 472 MiB 15 GiB 1.4 TiB 80.40 1.08 102 up osd.50 >>>> >> >> >>> >> >> >>>>> 51 hdd 7.27739 1.00000 7.3 TiB 5.9 TiB 5.9 >>>> TiB 729 MiB 15 GiB 1.3 TiB 81.70 1.10 110 up osd.51 >>>> >> >> >>> >> >> >>>>> 72 hdd 7.32619 1.00000 7.3 TiB 4.8 TiB 4.8 >>>> TiB 91 MiB 12 GiB 2.5 TiB 65.82 0.88 89 up osd.72 >>>> >> >> >>> >> >> >>>>> -37 58.55478 - 59 TiB 46 TiB 46 >>>> TiB 5.0 GiB 124 GiB 12 TiB 79.04 1.06 - host s3db12 >>>> >> >> >>> >> >> >>>>> 19 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.1 >>>> TiB 462 MiB 8.2 GiB 559 GiB 85.18 1.14 55 up osd.19 >>>> >> >> >>> >> >> >>>>> 71 hdd 3.68750 1.00000 3.7 TiB 2.9 TiB 2.8 >>>> TiB 3.9 MiB 7.8 GiB 825 GiB 78.14 1.05 50 up osd.71 >>>> >> >> >>> >> >> >>>>> 75 hdd 3.68750 1.00000 3.7 TiB 3.1 TiB 3.1 >>>> TiB 576 MiB 8.3 GiB 555 GiB 85.29 1.14 57 up osd.75 >>>> >> >> >>> >> >> >>>>> 76 hdd 3.68750 1.00000 3.7 TiB 3.2 TiB 3.1 >>>> TiB 239 MiB 9.3 GiB 501 GiB 86.73 1.16 50 up osd.76 >>>> >> >> >>> >> >> >>>>> 77 hdd 14.60159 1.00000 15 TiB 11 TiB 11 >>>> TiB 880 MiB 30 GiB 3.6 TiB 75.57 1.01 202 up osd.77 >>>> >> >> >>> >> >> >>>>> 78 hdd 14.60159 1.00000 15 TiB 11 TiB 11 >>>> TiB 1.0 GiB 30 GiB 3.4 TiB 76.65 1.03 196 up osd.78 >>>> >> >> >>> >> >> >>>>> 83 hdd 14.60159 1.00000 15 TiB 12 TiB 12 >>>> TiB 1.8 GiB 31 GiB 2.9 TiB 80.04 1.07 223 up osd.83 >>>> >> >> >>> >> >> >>>>> -3 58.49872 - 58 TiB 43 TiB 38 >>>> TiB 8.1 GiB 91 GiB 16 TiB 73.15 0.98 - host s3db2 >>>> >> >> >>> >> >> >>>>> 1 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 3.1 GiB 38 GiB 3.6 TiB 75.52 1.01 194 up osd.1 >>>> >> >> >>> >> >> >>>>> 3 hdd 3.63689 1.00000 3.6 TiB 2.2 TiB 1.4 >>>> TiB 418 MiB 0 B 1.4 TiB 60.94 0.82 52 up osd.3 >>>> >> >> >>> >> >> >>>>> 4 hdd 3.63689 0.89999 3.6 TiB 3.2 TiB 401 >>>> GiB 845 MiB 0 B 401 GiB 89.23 1.20 53 up osd.4 >>>> >> >> >>> >> >> >>>>> 5 hdd 3.63689 1.00000 3.6 TiB 2.3 TiB 1.3 >>>> TiB 437 MiB 0 B 1.3 TiB 62.88 0.84 51 up osd.5 >>>> >> >> >>> >> >> >>>>> 6 hdd 3.63689 1.00000 3.6 TiB 2.0 TiB 1.7 >>>> TiB 1.8 GiB 0 B 1.7 TiB 54.51 0.73 47 up osd.6 >>>> >> >> >>> >> >> >>>>> 7 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 493 MiB 26 GiB 3.8 TiB 73.90 0.99 185 up osd.7 >>>> >> >> >>> >> >> >>>>> 74 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 1.1 GiB 27 GiB 3.5 TiB 76.27 1.02 208 up osd.74 >>>> >> >> >>> >> >> >>>>> -4 58.49872 - 58 TiB 43 TiB 37 >>>> TiB 33 GiB 86 GiB 15 TiB 74.05 0.99 - host s3db3 >>>> >> >> >>> >> >> >>>>> 2 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 850 MiB 26 GiB 4.0 TiB 72.78 0.98 203 up osd.2 >>>> >> >> >>> >> >> >>>>> 9 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 8.3 GiB 33 GiB 3.6 TiB 75.62 1.01 189 up osd.9 >>>> >> >> >>> >> >> >>>>> 10 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 663 MiB 28 GiB 3.5 TiB 76.34 1.02 211 up osd.10 >>>> >> >> >>> >> >> >>>>> 12 hdd 3.63689 1.00000 3.6 TiB 2.4 TiB 1.2 >>>> TiB 633 MiB 0 B 1.2 TiB 66.22 0.89 44 up osd.12 >>>> >> >> >>> >> >> >>>>> 13 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 720 >>>> GiB 2.3 GiB 0 B 720 GiB 80.66 1.08 66 up osd.13 >>>> >> >> >>> >> >> >>>>> 14 hdd 3.63689 1.00000 3.6 TiB 3.1 TiB 552 >>>> GiB 18 GiB 0 B 552 GiB 85.18 1.14 60 up osd.14 >>>> >> >> >>> >> >> >>>>> 15 hdd 3.63689 1.00000 3.6 TiB 2.0 TiB 1.7 >>>> TiB 2.1 GiB 0 B 1.7 TiB 53.72 0.72 44 up osd.15 >>>> >> >> >>> >> >> >>>>> -5 58.49872 - 58 TiB 45 TiB 37 >>>> TiB 7.2 GiB 99 GiB 14 TiB 76.37 1.02 - host s3db4 >>>> >> >> >>> >> >> >>>>> 11 hdd 14.65039 1.00000 15 TiB 12 TiB 12 >>>> TiB 897 MiB 28 GiB 2.8 TiB 81.15 1.09 205 up osd.11 >>>> >> >> >>> >> >> >>>>> 17 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 1.2 GiB 27 GiB 3.6 TiB 75.38 1.01 211 up osd.17 >>>> >> >> >>> >> >> >>>>> 18 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 965 MiB 44 GiB 4.0 TiB 72.86 0.98 188 up osd.18 >>>> >> >> >>> >> >> >>>>> 20 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 796 >>>> GiB 529 MiB 0 B 796 GiB 78.63 1.05 66 up osd.20 >>>> >> >> >>> >> >> >>>>> 21 hdd 3.63689 1.00000 3.6 TiB 2.6 TiB 1.1 >>>> TiB 2.1 GiB 0 B 1.1 TiB 70.32 0.94 47 up osd.21 >>>> >> >> >>> >> >> >>>>> 22 hdd 3.63689 1.00000 3.6 TiB 2.9 TiB 802 >>>> GiB 882 MiB 0 B 802 GiB 78.47 1.05 58 up osd.22 >>>> >> >> >>> >> >> >>>>> 24 hdd 3.63689 1.00000 3.6 TiB 2.8 TiB 856 >>>> GiB 645 MiB 0 B 856 GiB 77.01 1.03 47 up osd.24 >>>> >> >> >>> >> >> >>>>> -6 58.89636 - 59 TiB 44 TiB 44 >>>> TiB 2.4 GiB 111 GiB 15 TiB 75.22 1.01 - host s3db5 >>>> >> >> >>> >> >> >>>>> 0 hdd 3.73630 1.00000 3.7 TiB 2.4 TiB 2.3 >>>> TiB 70 MiB 6.6 GiB 1.3 TiB 65.00 0.87 48 up osd.0 >>>> >> >> >>> >> >> >>>>> 25 hdd 3.73630 1.00000 3.7 TiB 2.4 TiB 2.3 >>>> TiB 5.3 MiB 6.6 GiB 1.4 TiB 63.86 0.86 41 up osd.25 >>>> >> >> >>> >> >> >>>>> 26 hdd 3.73630 1.00000 3.7 TiB 2.9 TiB 2.8 >>>> TiB 181 MiB 7.6 GiB 862 GiB 77.47 1.04 48 up osd.26 >>>> >> >> >>> >> >> >>>>> 27 hdd 3.73630 1.00000 3.7 TiB 2.3 TiB 2.2 >>>> TiB 7.0 MiB 6.1 GiB 1.5 TiB 61.00 0.82 48 up osd.27 >>>> >> >> >>> >> >> >>>>> 28 hdd 14.65039 1.00000 15 TiB 12 TiB 12 >>>> TiB 937 MiB 30 GiB 2.8 TiB 81.19 1.09 203 up osd.28 >>>> >> >> >>> >> >> >>>>> 29 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 536 MiB 26 GiB 3.8 TiB 73.95 0.99 200 up osd.29 >>>> >> >> >>> >> >> >>>>> 30 hdd 14.65039 1.00000 15 TiB 12 TiB 11 >>>> TiB 744 MiB 28 GiB 3.1 TiB 79.07 1.06 207 up osd.30 >>>> >> >> >>> >> >> >>>>> -7 58.89636 - 59 TiB 44 TiB 44 >>>> TiB 14 GiB 122 GiB 14 TiB 75.41 1.01 - host s3db6 >>>> >> >> >>> >> >> >>>>> 32 hdd 3.73630 1.00000 3.7 TiB 3.1 TiB 3.0 >>>> TiB 16 MiB 8.2 GiB 622 GiB 83.74 1.12 65 up osd.32 >>>> >> >> >>> >> >> >>>>> 33 hdd 3.73630 0.79999 3.7 TiB 3.0 TiB 2.9 >>>> TiB 14 MiB 8.1 GiB 740 GiB 80.67 1.08 52 up osd.33 >>>> >> >> >>> >> >> >>>>> 34 hdd 3.73630 0.79999 3.7 TiB 2.9 TiB 2.8 >>>> TiB 449 MiB 7.7 GiB 877 GiB 77.08 1.03 52 up osd.34 >>>> >> >> >>> >> >> >>>>> 35 hdd 3.73630 0.79999 3.7 TiB 2.3 TiB 2.2 >>>> TiB 133 MiB 7.0 GiB 1.4 TiB 62.18 0.83 42 up osd.35 >>>> >> >> >>> >> >> >>>>> 36 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 544 MiB 26 GiB 4.0 TiB 72.98 0.98 220 up osd.36 >>>> >> >> >>> >> >> >>>>> 37 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 11 GiB 38 GiB 3.6 TiB 75.30 1.01 200 up osd.37 >>>> >> >> >>> >> >> >>>>> 38 hdd 14.65039 1.00000 15 TiB 11 TiB 11 >>>> TiB 1.2 GiB 28 GiB 3.3 TiB 77.43 1.04 217 up osd.38 >>>> >> >> >>> >> >> >>>>> -8 58.89636 - 59 TiB 47 TiB 46 >>>> TiB 3.9 GiB 116 GiB 12 TiB 78.98 1.06 - host s3db7 >>>> >> >> >>> >> >> >>>>> 39 hdd 3.73630 1.00000 3.7 TiB 3.2 TiB 3.2 >>>> TiB 19 MiB 8.5 GiB 499 GiB 86.96 1.17 43 up osd.39 >>>> >> >> >>> >> >> >>>>> 40 hdd 3.73630 1.00000 3.7 TiB 2.6 TiB 2.5 >>>> TiB 144 MiB 7.0 GiB 1.2 TiB 68.33 0.92 39 up osd.40 >>>> >> >> >>> >> >> >>>>> 41 hdd 3.73630 1.00000 3.7 TiB 3.0 TiB 2.9 >>>> TiB 218 MiB 7.9 GiB 732 GiB 80.86 1.08 64 up osd.41 >>>> >> >> >>> >> >> >>>>> 42 hdd 3.73630 1.00000 3.7 TiB 2.5 TiB 2.4 >>>> TiB 594 MiB 7.0 GiB 1.2 TiB 67.97 0.91 50 up osd.42 >>>> >> >> >>> >> >> >>>>> 43 hdd 14.65039 1.00000 15 TiB 12 TiB 12 >>>> TiB 564 MiB 28 GiB 2.9 TiB 80.32 1.08 213 up osd.43 >>>> >> >> >>> >> >> >>>>> 44 hdd 14.65039 1.00000 15 TiB 12 TiB 11 >>>> TiB 1.3 GiB 28 GiB 3.1 TiB 78.59 1.05 198 up osd.44 >>>> >> >> >>> >> >> >>>>> 45 hdd 14.65039 1.00000 15 TiB 12 TiB 12 >>>> TiB 1.2 GiB 30 GiB 2.8 TiB 81.05 1.09 214 up osd.45 >>>> >> >> >>> >> >> >>>>> -9 51.28331 - 51 TiB 41 TiB 41 >>>> TiB 4.9 GiB 108 GiB 10 TiB 79.75 1.07 - host s3db8 >>>> >> >> >>> >> >> >>>>> 8 hdd 7.32619 1.00000 7.3 TiB 5.8 TiB 5.8 >>>> TiB 472 MiB 15 GiB 1.5 TiB 79.68 1.07 99 up osd.8 >>>> >> >> >>> >> >> >>>>> 16 hdd 7.32619 1.00000 7.3 TiB 5.9 TiB 5.8 >>>> TiB 785 MiB 15 GiB 1.4 TiB 80.25 1.08 97 up osd.16 >>>> >> >> >>> >> >> >>>>> 31 hdd 7.32619 1.00000 7.3 TiB 5.5 TiB 5.5 >>>> TiB 438 MiB 14 GiB 1.8 TiB 75.36 1.01 87 up osd.31 >>>> >> >> >>> >> >> >>>>> 52 hdd 7.32619 1.00000 7.3 TiB 5.7 TiB 5.7 >>>> TiB 844 MiB 15 GiB 1.6 TiB 78.19 1.05 113 up osd.52 >>>> >> >> >>> >> >> >>>>> 53 hdd 7.32619 1.00000 7.3 TiB 6.2 TiB 6.1 >>>> TiB 792 MiB 18 GiB 1.1 TiB 84.46 1.13 109 up osd.53 >>>> >> >> >>> >> >> >>>>> 54 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 >>>> TiB 959 MiB 15 GiB 1.7 TiB 76.73 1.03 115 up osd.54 >>>> >> >> >>> >> >> >>>>> 55 hdd 7.32619 1.00000 7.3 TiB 6.1 TiB 6.1 >>>> TiB 699 MiB 16 GiB 1.2 TiB 83.56 1.12 122 up osd.55 >>>> >> >> >>> >> >> >>>>> -10 51.28331 - 51 TiB 39 TiB 39 >>>> TiB 4.7 GiB 100 GiB 12 TiB 76.05 1.02 - host s3db9 >>>> >> >> >>> >> >> >>>>> 56 hdd 7.32619 1.00000 7.3 TiB 5.2 TiB 5.2 >>>> TiB 840 MiB 13 GiB 2.1 TiB 71.06 0.95 105 up osd.56 >>>> >> >> >>> >> >> >>>>> 57 hdd 7.32619 1.00000 7.3 TiB 6.1 TiB 6.0 >>>> TiB 1.0 GiB 16 GiB 1.2 TiB 83.17 1.12 102 up osd.57 >>>> >> >> >>> >> >> >>>>> 58 hdd 7.32619 1.00000 7.3 TiB 6.0 TiB 5.9 >>>> TiB 43 MiB 15 GiB 1.4 TiB 81.56 1.09 105 up osd.58 >>>> >> >> >>> >> >> >>>>> 59 hdd 7.32619 1.00000 7.3 TiB 5.9 TiB 5.9 >>>> TiB 429 MiB 15 GiB 1.4 TiB 80.64 1.08 94 up osd.59 >>>> >> >> >>> >> >> >>>>> 60 hdd 7.32619 1.00000 7.3 TiB 5.4 TiB 5.3 >>>> TiB 226 MiB 14 GiB 2.0 TiB 73.25 0.98 101 up osd.60 >>>> >> >> >>> >> >> >>>>> 61 hdd 7.32619 1.00000 7.3 TiB 4.8 TiB 4.8 >>>> TiB 1.1 GiB 12 GiB 2.5 TiB 65.84 0.88 103 up osd.61 >>>> >> >> >>> >> >> >>>>> 62 hdd 7.32619 1.00000 7.3 TiB 5.6 TiB 5.6 >>>> TiB 1.0 GiB 15 GiB 1.7 TiB 76.83 1.03 126 up osd.62 >>>> >> >> >>> >> >> >>>>> TOTAL 674 TiB 501 TiB 473 >>>> TiB 96 GiB 1.2 TiB 173 TiB 74.57 >>>> >> >> >>> >> >> >>>>> MIN/MAX VAR: 0.17/1.20 STDDEV: 10.25 >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> Am Sa., 13. März 2021 um 15:57 Uhr schrieb Dan >>>> van der Ster <dan(a)vanderster.com>: >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> No, increasing num PGs won't help substantially. >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> Can you share the entire output of ceph osd df >>>> tree ? >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> Did you already set >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> ceph config set mgr >>>> mgr/balancer/upmap_max_deviation 1 >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> ?? >>>> >> >> >>> >> >> >>>>>> And I recommend debug_mgr 4/5 so you can see >>>> some basic upmap balancer logging. >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> .. Dan >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> >>>> >> >> >>> >> >> >>>>>> On Sat, Mar 13, 2021, 3:49 PM Boris Behrens < >>>> bb(a)kervyn.de> wrote: >>>> >> >> >>> >> >> >>>>>>> >>>> >> >> >>> >> >> >>>>>>> Hello people, >>>> >> >> >>> >> >> >>>>>>> >>>> >> >> >>> >> >> >>>>>>> I am still struggeling with the balancer >>>> >> >> >>> >> >> >>>>>>> ( >>>> https://www.mail-archive.com/ceph-users@ceph.io/msg09124.html) >>>> >> >> >>> >> >> >>>>>>> Now I've read some more and might think that I >>>> do not have enough PGs. >>>> >> >> >>> >> >> >>>>>>> Currently I have 84OSDs and 1024PGs for the >>>> main pool (3008 total). I >>>> >> >> >>> >> >> >>>>>>> have the autoscaler enabled, but I doesn't tell >>>> me to increase the >>>> >> >> >>> >> >> >>>>>>> PGs. >>>> >> >> >>> >> >> >>>>>>> >>>> >> >> >>> >> >> >>>>>>> What do you think? >>>> >> >> >>> >> >> >>>>>>> >>>> >> >> >>> >> >> >>>>>>> -- >>>> >> >> >>> >> >> >>>>>>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft >>>> sich diesmal abweichend >>>> >> >> >>> >> >> >>>>>>> im groÃƒ¼en Saal. >>>> >> >> >>> >> >> >>>>>>> _______________________________________________ >>>> >> >> >>> >> >> >>>>>>> ceph-users mailing list -- ceph-users(a)ceph.io >>>> >> >> >>> >> >> >>>>>>> To unsubscribe send an email to >>>> ceph-users-leave(a)ceph.io >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> >>>> >> >> >>> >> >> >>>>> -- >>>> >> >> >>> >> >> >>>>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft >>>> sich diesmal abweichend im groÃƒ¼en Saal. >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> >>>> >> >> >>> >> >> >>> -- >>>> >> >> >>> >> >> >>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich >>>> diesmal abweichend im groÃƒ¼en Saal. >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > >>>> >> >> >>> >> >> > -- >>>> >> >> >>> >> >> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich >>>> diesmal abweichend im groÃƒ¼en Saal. >>>> >> >> >>> >> > >>>> >> >> >>> >> > >>>> >> >> >>> >> > >>>> >> >> >>> >> > -- >>>> >> >> >>> >> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich >>>> diesmal abweichend im groÃƒ¼en Saal. >>>> >> >> >>> > >>>> >> >> >>> > >>>> >> >> >>> > >>>> >> >> >>> > -- >>>> >> >> >>> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal >>>> abweichend im groÃƒ¼en Saal. >>>> >> >> >> >>>> >> >> >> >>>> >> >> >> >>>> >> >> >> -- >>>> >> >> >> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal >>>> abweichend im groÃƒ¼en Saal. >>>> >> >> > >>>> >> >> > >>>> >> >> > >>>> >> >> > -- >>>> >> >> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal >>>> abweichend im groÃƒ¼en Saal. >>>> >> > >>>> >> > >>>> >> > >>>> >> > -- >>>> >> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal >>>> abweichend im groÃƒ¼en Saal. >>>> > >>>> > >>>> > >>>> > -- >>>> > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend >>>> im groÃƒ¼en Saal. >>>> >>> >>> >>> -- >>> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im >>> groÃƒ¼en Saal. >>> >> > > -- > Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im > groÃƒ¼en Saal. > -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

3 years

2
6
0 0

forceful remap PGs

by Boris Behrens

Hi, I have a couple OSDs that currently get a lot of data, and are running towards 95% fillrate. I would like to forcefully remap some PGs (they are around 100GB) to more empty OSDs and drop them from the full OSDs. I know this would lead to degraded objects, but I am not sure how long the cluster will stay in a state where it can allocate objects. OSD.105 grew from around 85% to 92% in the last 4 hours. This is the current state cluster: id: dca79fff-ffd0-58f4-1cff-82a2feea05f4 health: HEALTH_WARN noscrub,nodeep-scrub flag(s) set 9 backfillfull osd(s) 19 nearfull osd(s) 37 pool(s) backfillfull BlueFS spillover detected on 1 OSD(s) 13 large omap objects Low space hindering backfill (add storage if this doesn't resolve itself): 248 pgs backfill_toofull Degraded data redundancy: 18115/362288820 objects degraded (0.005%), 1 pg degraded, 1 pg undersized services: mon: 3 daemons, quorum ceph-s3-mon1,ceph-s3-mon2,ceph-s3-mon3 (age 6d) mgr: ceph-mgr2(active, since 6d), standbys: ceph-mgr3, ceph-mgr1 mds: 3 up:standby osd: 110 osds: 110 up (since 4d), 110 in (since 6d); 324 remapped pgs flags noscrub,nodeep-scrub rgw: 4 daemons active (admin, eu-central-1, eu-msg-1, eu-secure-1) task status: data: pools: 37 pools, 4032 pgs objects: 120.76M objects, 197 TiB usage: 620 TiB used, 176 TiB / 795 TiB avail pgs: 18115/362288820 objects degraded (0.005%) 47144186/362288820 objects misplaced (13.013%) 3708 active+clean 241 active+remapped+backfill_wait+backfill_toofull 63 active+remapped+backfill_wait 11 active+remapped+backfilling 6 active+remapped+backfill_toofull 1 active+remapped+backfilling+forced_backfill 1 active+remapped+forced_backfill+backfill_toofull 1 active+undersized+degraded+remapped+backfilling io: client: 23 MiB/s rd, 252 MiB/s wr, 347 op/s rd, 381 op/s wr recovery: 194 MiB/s, 112 objects/s --- ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME -1 795.42548 - 795 TiB 620 TiB 582 TiB 82 GiB 1.4 TiB 176 TiB 77.90 1.00 - root default 84 hdd 7.52150 1.00000 7.5 TiB 6.8 TiB 6.5 TiB 158 MiB 15 GiB 764 GiB 90.07 1.16 121 up osd.84 79 hdd 3.63689 1.00000 3.6 TiB 3.3 TiB 367 GiB 1.9 GiB 0 B 367 GiB 90.15 1.16 64 up osd.79 70 hdd 7.27739 1.00000 7.3 TiB 6.6 TiB 6.5 TiB 268 MiB 15 GiB 730 GiB 90.20 1.16 121 up osd.70 82 hdd 3.63689 1.00000 3.6 TiB 3.3 TiB 364 GiB 1.1 GiB 0 B 364 GiB 90.23 1.16 59 up osd.82 89 hdd 7.52150 1.00000 7.5 TiB 6.8 TiB 6.6 TiB 395 MiB 16 GiB 735 GiB 90.45 1.16 126 up osd.89 90 hdd 7.52150 1.00000 7.5 TiB 6.8 TiB 6.6 TiB 338 MiB 15 GiB 723 GiB 90.62 1.16 112 up osd.90 33 hdd 3.73630 1.00000 3.7 TiB 3.4 TiB 3.3 TiB 382 MiB 8.6 GiB 358 GiB 90.64 1.16 66 up osd.33 66 hdd 7.27739 0.95000 7.3 TiB 6.7 TiB 6.7 TiB 313 MiB 16 GiB 605 GiB 91.88 1.18 122 up osd.66 46 hdd 7.27739 1.00000 7.3 TiB 6.7 TiB 6.7 TiB 312 MiB 16 GiB 601 GiB 91.93 1.18 119 up osd.46 105 hdd 3.63869 0.89999 3.6 TiB 3.4 TiB 3.4 TiB 206 MiB 8.1 GiB 281 GiB 92.45 1.19 58 up osd.105 -- Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im groÃƒ¼en Saal.

3 years

2
4
0 0

2024

2023

2022

2021

2020

2019

ceph-users March 2021