June 2020 - ceph-users - lists.ceph.io

Is there a way froce sync metadata in a multisite cluster

by 黄明友

Hi,all: the slave zone show metadata is caught up with master ; but use radosgw-admin bucket list|wc diff master and the slave zone , is not equal. how can I force sync it?

3 years, 10 months

3
3
1 0

Ceph and linux multi queue block IO layer

by Bobby

Hi all, One question please. Does Ceph uses the linux multi queue block IO layer ? BR Bobby

3 years, 10 months

1
0
0 0

RGW slowdown over time

by Glen Baars

Hello Ceph users, We are experiencing an issue with ceph 14.2.9 / RGW Beast frontend. We are seeing this across our two separate clusters. Over a few weeks the qlen and qactive are going up and not returning to zero. At some point we start seeing performance degrade and we need to reboot the services. We are viewing the queue numbers in the perfcounters_dump. In objecter_requests we aren't seeing any request ( apart from very briefly ) We can reproduce the issue by use S3 browser and setting the concurrent downloads to 100. After completing download of ~1000 files, the queue length has incremented by 2-5 and never returns back to zero. Subsequent bulk downloads increase the qlen. We have the following tunables set rgw_bucket_index_max_aio 128 rgw_dns_name <fqdn##> rgw_frontends beast ssl_port=443 ssl_certificate=<CERT##> rgw_max_chunk_size 4194304 rgw_num_rados_handles 16 rgw_thread_pool_size 500 Anyone seen this or have any idea how to further debug? Any additional tuning suggested? 350TB S3 data Glen This e-mail is intended solely for the benefit of the addressee(s) and any other named recipient. It is confidential and may contain legally privileged or confidential information. If you are not the recipient, any use, distribution, disclosure or copying of this e-mail is prohibited. The confidentiality and legal privilege attached to this communication is not waived or lost by reason of the mistaken transmission or delivery to you. If you have received this e-mail in error, please notify us immediately.

3 years, 10 months

1
0
0 0

ERROR: osd init failed: (1) Operation not permitted

by Ml Ml

Hello List, first of all: Yes - i made mistakes. Now i am trying to recover :-/ I had a healthy 3 node cluster which i wanted to convert to a single one. My goal was to reinstall a fresh 3 Node cluster and start with 2 nodes. I was able to healthy turn it from a 3 Node Cluster to a 2 Node cluster. Then the problems began. I started to change size=1 and min_size=1. Health was okay until here. Then over sudden both nodes got fenced...one node refused to boot, mons where missing, etc...to make long story short, here is where i am right now: root@node03:~ # ceph -s cluster b3be313f-d0ef-42d5-80c8-6b41380a47e3 health HEALTH_WARN 53 pgs stale 53 pgs stuck stale monmap e4: 2 mons at {0=10.15.15.3:6789/0,1=10.15.15.2:6789/0} election epoch 298, quorum 0,1 1,0 osdmap e6097: 14 osds: 9 up, 9 in pgmap v93644673: 512 pgs, 1 pools, 1193 GB data, 304 kobjects 1088 GB used, 32277 GB / 33366 GB avail 459 active+clean 53 stale+active+clean root@node03:~ # ceph osd tree ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY -1 32.56990 root default -2 25.35992 host node03 0 3.57999 osd.0 up 1.00000 1.00000 5 3.62999 osd.5 up 1.00000 1.00000 6 3.62999 osd.6 up 1.00000 1.00000 7 3.62999 osd.7 up 1.00000 1.00000 8 3.62999 osd.8 up 1.00000 1.00000 19 3.62999 osd.19 up 1.00000 1.00000 20 3.62999 osd.20 up 1.00000 1.00000 -3 7.20998 host node02 3 3.62999 osd.3 up 1.00000 1.00000 4 3.57999 osd.4 up 1.00000 1.00000 1 0 osd.1 down 0 1.00000 9 0 osd.9 down 0 1.00000 10 0 osd.10 down 0 1.00000 17 0 osd.17 down 0 1.00000 18 0 osd.18 down 0 1.00000 my main mistakes seemd to be: -------------------------------- ceph osd out osd.1 ceph auth del osd.1 systemctl stop ceph-osd@1 ceph osd rm 1 umount /var/lib/ceph/osd/ceph-1 ceph osd crush remove osd.1 As far as i can tell, ceph waits and needs data from that OSD.1 (which i removed) root@node03:~ # ceph health detail HEALTH_WARN 53 pgs stale; 53 pgs stuck stale pg 0.1a6 is stuck stale for 5086.552795, current state stale+active+clean, last acting [1] pg 0.142 is stuck stale for 5086.552784, current state stale+active+clean, last acting [1] pg 0.1e is stuck stale for 5086.552820, current state stale+active+clean, last acting [1] pg 0.e0 is stuck stale for 5086.552855, current state stale+active+clean, last acting [1] pg 0.1d is stuck stale for 5086.552822, current state stale+active+clean, last acting [1] pg 0.13c is stuck stale for 5086.552791, current state stale+active+clean, last acting [1] [...] SNIP [...] pg 0.e9 is stuck stale for 5086.552955, current state stale+active+clean, last acting [1] pg 0.87 is stuck stale for 5086.552939, current state stale+active+clean, last acting [1] When i try to start ODS.1 manually, i get: -------------------------------------------- 2020-02-10 18:48:26.107444 7f9ce31dd880 0 ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-osd, pid 10210 2020-02-10 18:48:26.134417 7f9ce31dd880 0 filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342) 2020-02-10 18:48:26.184202 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is supported and appears to work 2020-02-10 18:48:26.184209 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option 2020-02-10 18:48:26.184526 7f9ce31dd880 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel) 2020-02-10 18:48:26.184585 7f9ce31dd880 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: extsize is disabled by conf 2020-02-10 18:48:26.309755 7f9ce31dd880 0 filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled 2020-02-10 18:48:26.633926 7f9ce31dd880 1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2020-02-10 18:48:26.642185 7f9ce31dd880 1 journal _open /var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1 2020-02-10 18:48:26.664273 7f9ce31dd880 0 <cls> cls/hello/cls_hello.cc:271: loading cls_hello 2020-02-10 18:48:26.732154 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400, adjusting msgr requires for clients 2020-02-10 18:48:26.732163 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400 was 8705, adjusting msgr requires for mons 2020-02-10 18:48:26.732167 7f9ce31dd880 0 osd.1 6002 crush map has features 1107558400, adjusting msgr requires for osds 2020-02-10 18:48:26.732179 7f9ce31dd880 0 osd.1 6002 load_pgs 2020-02-10 18:48:31.939810 7f9ce31dd880 0 osd.1 6002 load_pgs opened 53 pgs 2020-02-10 18:48:31.940546 7f9ce31dd880 -1 osd.1 6002 log_to_monitors {default=true} 2020-02-10 18:48:31.942471 7f9ce31dd880 1 journal close /var/lib/ceph/osd/ceph-1/journal 2020-02-10 18:48:31.969205 7f9ce31dd880 -1 ESC[0;31m ** ERROR: osd init failed: (1) Operation not permittedESC[0m Its mounted: /dev/sdg1 3.7T 127G 3.6T 4% /var/lib/ceph/osd/ceph-1 Is there any way i can get the OSD.1 back in? Thanks a lot, mario

3 years, 10 months

2
1
0 0

OSD Keeps crashing, stack trace attached

by Lindsay Mathieson

I have a problem with one osd (osd.5 on server lod) that keeps crashing. Often it immediately crashes on restart, but oddly a server reboot fixes that, also it alwats starts ok from the command line. Service status and journalctl don't show any useful information. There's two osd's on the server, the other osd never has a problem. Server * osd services only * 8GB Ram * Nautilus 14.2.9 * osd.5 : 1TB - crashes * osd.12 : 500GB - Fine So I ran it from the command line and copied the console dump when it crashed. Any thoughts? should I create a bug report for it? -- Lindsay

3 years, 10 months

1
1
0 0

OSD node OS upgrade strategy

by shubjero

Hi all, I have a 39 node, 1404 spinning disk Ceph Mimic cluster across 6 racks for a total of 9.1PiB raw and about 40% utilized. These storage nodes started their life on Ubuntu 14.04 and in-place upgraded to 16.04 2 years ago however I have started a project to do fresh installs of each OSD node to Ubuntu 18.04 to keep things fresh and well supported. I am reaching out to see what others might suggest in terms of strategy to get these hosts updated quicker than my current strategy. Current strategy: 1. Pick 3 nodes, drain them by modifying the crush weight 2. Fresh install 18.04 using automation tool (MAAS) + some Ansible playbooks to setup server 3. Purge OSD node worth of OSD' (this causes data to be 'misplaced' due to rack weight changing) 4. Run ceph-volume lvm batch for osd node 5. Move OSD's in to desired hosts in crush map (large rebalancing to fill back up) If anyone has suggestions on a quicker way to do this I am all ears. I am wondering if its not necessary to have to drain/fill OSD nodes at all and if this can be done with just a fresh install and not touch the OSD's however I don't know how to perform a fresh installation and then tell ceph that I have OSD's with data on them and to somehow re-register them with the cluster? Or is there a better order of operations to draining/filling without causing a high amount of objects to be misplaced due to manipulating the crush map. That being said, since our cluster is a bit older and the majority of our bluestore osd's are provisioned in the 'simple' method using a small metadata partition and the remainder as a raw partition whereas now it seems the suggested way is to use the lvm layout and tmpfs. Anyways, I'm all ears and appreciate any feedback. Jared Baker Ontario Institute for Cancer Research

3 years, 10 months

3
2
0 0

bluestore_rocksdb_options

by Seena Fallah

Hi. I found a default rocksdb option in bluestore that I can't find in facebook rocksdb. recycle_log_file_num this config if a boolean config in facebook rocksdb but in default Ceph configs the value of this is 4. Can someone tell what it means?

3 years, 10 months

2
6
0 0

meta values on nvme class OSDs

by Emre Eryilmaz

Hi, Ceph version 12.2.13. Ceph status show ‘Health_Warn’ nearfull osd(s) and pool(s). No data in this nvme class pools and nvme osd’s, completely free. But meta values is so big in nvme osds. Why does meta values on OSDs shows huge disk usage? $ ceph health detail HEALTH_WARN 1 nearfull osd(s); 6 pool(s) nearfull OSD_NEARFULL 1 nearfull osd(s) osd.49 is near full POOL_NEARFULL 6 pool(s) nearfull pool 'VmImages_NVMe_FdChassis_Rp3' is nearfull pool '.rgw.root' is nearfull pool 'default.rgw.control' is nearfull pool 'default.rgw.meta' is nearfull pool 'default.rgw.log' is nearfull pool 'default.rgw.buckets.non-ec' is nearfull -- $ ceph df --cluster ceph GLOBAL: SIZE AVAIL RAW USED %RAW USED 501TiB 354TiB 147TiB 29.42 POOLS: NAME ID USED %USED MAX AVAIL OBJECTS VmImages_SSD_FdHost_Rp3 6 25.2TiB 33.26 50.6TiB 6990169 VmImages_NVMe_FdChassis_Rp3 8 19B 0 242GiB 2 VmImages_HDD_FdHost_Rp3 15 21.3TiB 35.33 39.0TiB 5755928 .rgw.root 16 1.77KiB 0 16.3TiB 11 default.rgw.control 17 0B 0 16.3TiB 8 default.rgw.meta 18 17.6KiB 0 16.3TiB 80 default.rgw.log 19 162B 0 16.3TiB 210 default.rgw.buckets.index 20 0B 0 50.6TiB 252 default.rgw.buckets.data 21 95.8GiB 0.24 39.0TiB 1561373 default.rgw.buckets.non-ec 22 0B 0 16.3TiB 4 default.rgw.buckets.data.ssd 25 0B 0 50.6TiB 0 $ ceph osd df tree | grep nvme ID CLASS WEIGHT REWEIGHT SIZE USE DATA OMAP META AVAIL %USE VAR PGS TYPE NAME 0 nvme 0.36378 1.00000 373GiB 313GiB 57.3MiB 25.8MiB 313GiB 59.0GiB 84.15 2.86 79 osd.0 7 nvme 0.36378 1.00000 373GiB 307GiB 57.4MiB 22.9MiB 307GiB 65.8GiB 82.32 2.80 70 osd.7 14 nvme 0.36378 1.00000 373GiB 313GiB 57.3MiB 25.9MiB 313GiB 59.3GiB 84.08 2.86 79 osd.14 21 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 24.2MiB 312GiB 60.8GiB 83.68 2.85 78 osd.21 28 nvme 0.36378 1.00000 373GiB 316GiB 57.3MiB 29.8MiB 316GiB 56.4GiB 84.87 2.89 85 osd.28 35 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 26.5MiB 312GiB 60.5GiB 83.75 2.85 80 osd.35 42 nvme 0.36378 1.00000 373GiB 311GiB 57.5MiB 25.5MiB 311GiB 61.9GiB 83.39 2.84 79 osd.42 49 nvme 0.36378 1.00000 373GiB 318GiB 57.3MiB 27.7MiB 318GiB 54.8GiB 85.28 2.90 90 osd.49 56 nvme 0.36378 1.00000 373GiB 303GiB 57.3MiB 20.2MiB 303GiB 69.5GiB 81.33 2.77 71 osd.56 63 nvme 0.36378 1.00000 373GiB 309GiB 57.4MiB 28.5MiB 309GiB 63.2GiB 83.02 2.82 82 osd.63 70 nvme 0.36378 1.00000 373GiB 312GiB 57.3MiB 26.2MiB 312GiB 60.5GiB 83.76 2.85 88 osd.70 77 nvme 0.36378 1.00000 373GiB 299GiB 57.4MiB 22.7MiB 299GiB 73.6GiB 80.24 2.73 69 osd.77 84 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 31.7MiB 312GiB 60.2GiB 83.84 2.85 76 osd.84 91 nvme 0.36378 1.00000 373GiB 301GiB 57.3MiB 21.1MiB 301GiB 71.5GiB 80.80 2.75 76 osd.91 98 nvme 0.36378 1.00000 373GiB 285GiB 57.3MiB 15.7MiB 285GiB 87.6GiB 76.47 2.60 54 osd.98 105 nvme 0.36378 1.00000 373GiB 308GiB 57.3MiB 30.8MiB 308GiB 64.6GiB 82.67 2.81 86 osd.105 112 nvme 0.36378 1.00000 373GiB 264GiB 57.3MiB 26.0MiB 264GiB 108GiB 70.99 2.42 84 osd.112 119 nvme 0.36378 1.00000 373GiB 297GiB 57.3MiB 23.6MiB 297GiB 75.1GiB 79.85 2.72 74 osd.119 126 nvme 0.36378 1.00000 373GiB 292GiB 57.3MiB 22.8MiB 292GiB 80.2GiB 78.48 2.67 68 osd.126 133 nvme 0.36378 1.00000 373GiB 294GiB 57.3MiB 24.1MiB 294GiB 78.7GiB 78.87 2.68 70 osd.133

3 years, 10 months

1
0
0 0

meta values on nvme class OSDs

by Emre Eryilmaz

Hi, Ceph version 12.2.13. Ceph status show ‘Health_Warn’ nearfull osd(s) and pool(s). No data in this nvme class pools and nvme osd’s, completely free. But meta values is so big in nvme osds. Why does meta values on OSDs shows huge disk usage? $ ceph health detail HEALTH_WARN 1 nearfull osd(s); 6 pool(s) nearfull OSD_NEARFULL 1 nearfull osd(s) osd.49 is near full POOL_NEARFULL 6 pool(s) nearfull pool 'VmImages_NVMe_FdChassis_Rp3' is nearfull pool '.rgw.root' is nearfull pool 'default.rgw.control' is nearfull pool 'default.rgw.meta' is nearfull pool 'default.rgw.log' is nearfull pool 'default.rgw.buckets.non-ec' is nearfull -- $ ceph df --cluster ceph GLOBAL: SIZE AVAIL RAW USED %RAW USED 501TiB 354TiB 147TiB 29.42 POOLS: NAME ID USED %USED MAX AVAIL OBJECTS VmImages_SSD_FdHost_Rp3 6 25.2TiB 33.26 50.6TiB 6990169 VmImages_NVMe_FdChassis_Rp3 8 19B 0 242GiB 2 VmImages_HDD_FdHost_Rp3 15 21.3TiB 35.33 39.0TiB 5755928 .rgw.root 16 1.77KiB 0 16.3TiB 11 default.rgw.control 17 0B 0 16.3TiB 8 default.rgw.meta 18 17.6KiB 0 16.3TiB 80 default.rgw.log 19 162B 0 16.3TiB 210 default.rgw.buckets.index 20 0B 0 50.6TiB 252 default.rgw.buckets.data 21 95.8GiB 0.24 39.0TiB 1561373 default.rgw.buckets.non-ec 22 0B 0 16.3TiB 4 default.rgw.buckets.data.ssd 25 0B 0 50.6TiB 0 $ ceph osd df tree | grep nvme ID CLASS WEIGHT REWEIGHT SIZE USE DATA OMAP META AVAIL %USE VAR PGS TYPE NAME 0 nvme 0.36378 1.00000 373GiB 313GiB 57.3MiB 25.8MiB 313GiB 59.0GiB 84.15 2.86 79 osd.0 7 nvme 0.36378 1.00000 373GiB 307GiB 57.4MiB 22.9MiB 307GiB 65.8GiB 82.32 2.80 70 osd.7 14 nvme 0.36378 1.00000 373GiB 313GiB 57.3MiB 25.9MiB 313GiB 59.3GiB 84.08 2.86 79 osd.14 21 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 24.2MiB 312GiB 60.8GiB 83.68 2.85 78 osd.21 28 nvme 0.36378 1.00000 373GiB 316GiB 57.3MiB 29.8MiB 316GiB 56.4GiB 84.87 2.89 85 osd.28 35 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 26.5MiB 312GiB 60.5GiB 83.75 2.85 80 osd.35 42 nvme 0.36378 1.00000 373GiB 311GiB 57.5MiB 25.5MiB 311GiB 61.9GiB 83.39 2.84 79 osd.42 49 nvme 0.36378 1.00000 373GiB 318GiB 57.3MiB 27.7MiB 318GiB 54.8GiB 85.28 2.90 90 osd.49 56 nvme 0.36378 1.00000 373GiB 303GiB 57.3MiB 20.2MiB 303GiB 69.5GiB 81.33 2.77 71 osd.56 63 nvme 0.36378 1.00000 373GiB 309GiB 57.4MiB 28.5MiB 309GiB 63.2GiB 83.02 2.82 82 osd.63 70 nvme 0.36378 1.00000 373GiB 312GiB 57.3MiB 26.2MiB 312GiB 60.5GiB 83.76 2.85 88 osd.70 77 nvme 0.36378 1.00000 373GiB 299GiB 57.4MiB 22.7MiB 299GiB 73.6GiB 80.24 2.73 69 osd.77 84 nvme 0.36378 1.00000 373GiB 312GiB 57.4MiB 31.7MiB 312GiB 60.2GiB 83.84 2.85 76 osd.84 91 nvme 0.36378 1.00000 373GiB 301GiB 57.3MiB 21.1MiB 301GiB 71.5GiB 80.80 2.75 76 osd.91 98 nvme 0.36378 1.00000 373GiB 285GiB 57.3MiB 15.7MiB 285GiB 87.6GiB 76.47 2.60 54 osd.98 105 nvme 0.36378 1.00000 373GiB 308GiB 57.3MiB 30.8MiB 308GiB 64.6GiB 82.67 2.81 86 osd.105 112 nvme 0.36378 1.00000 373GiB 264GiB 57.3MiB 26.0MiB 264GiB 108GiB 70.99 2.42 84 osd.112 119 nvme 0.36378 1.00000 373GiB 297GiB 57.3MiB 23.6MiB 297GiB 75.1GiB 79.85 2.72 74 osd.119 126 nvme 0.36378 1.00000 373GiB 292GiB 57.3MiB 22.8MiB 292GiB 80.2GiB 78.48 2.67 68 osd.126 133 nvme 0.36378 1.00000 373GiB 294GiB 57.3MiB 24.1MiB 294GiB 78.7GiB 78.87 2.68 70 osd.133

3 years, 10 months

1
0
0 0

Mapped RBD is too slow?

by Michal.Plsek＠seznam.cz

Good day, rw operations (randwrite 4kB and 4MB) over mapped RBD are just too slow. I am also using librbd over TGT. fio input: [global] rw=randwrite ioengine=libaio iodepth=64 size=1g direct=1 buffered=0 startdelay=5 group_reporting=1 thread=1 ramp_time=5 time_based disk_util=0 clat_percentiles=0 disable_lat=1 disable_clat=1 disable_slat=1 #numjobs=16 runtime=60 filename=/mnt/disk/test1/testfile.fio [test] name=test bs=4k stonewall fio output for TGT (librbd): test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B- 4096B, ioengine=libaio, iodepth=64 Starting 1 thread test: Laying out IO files (2 files / total 1024MiB) test: (groupid=0, jobs=1): err= 0: pid=6909: Fri Jun 19 15:26:11 2020 write: IOPS=6342, BW=24.8MiB/s (25.0MB/s)(1487MiB/60003msec) bw ( KiB/s): min= 8, max=70216, per=100.00%, avg=30441.30, stdev= 28899.02, samples=100 iops : min= 2, max=17554, avg=7610.27, stdev=7224.76, samples= 100 cpu : usr=2.18%, sys=11.08%, ctx=107852, majf=0, minf=356 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64= 115.5% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64= 0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64= 0.0% issued rwts: total=0,380583,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): WRITE: bw=24.8MiB/s (25.0MB/s), 24.8MiB/s-24.8MiB/s (25.0MB/s-25.0MB/s), io=1487MiB (1559MB), run=60003-60003msec ----------------- fio output for RBD: test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B- 4096B, ioengine=libaio, iodepth=64 Starting 1 thread test: (groupid=0, jobs=1): err= 0: pid=7372: Fri Jun 19 15:33:51 2020 write: IOPS=909, BW=3642KiB/s (3729kB/s)(214MiB/60186msec) bw ( KiB/s): min= 2792, max= 4688, per=100.00%, avg=3648.13, stdev= 399.09, samples=120 iops : min= 698, max= 1172, avg=912.01, stdev=99.75, samples=120 cpu : usr=0.78%, sys=3.08%, ctx=37108, majf=0, minf=267 IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64= 110.2% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64= 0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64= 0.0% issued rwts: total=0,54732,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=64 Run status group 0 (all jobs): WRITE: bw=3642KiB/s (3729kB/s), 3642KiB/s-3642KiB/s (3729kB/s-3729kB/s), io=214MiB (224MB), run=60186-60186msec ----------------- I ran these tests from separate client server. I suspect that RBD is not working correctly in there, since I tried some fio tests on it and the result was almost the same with RBD cache set to false in ceph.conf (for example: fio -ioengine=libaio -name=test -bs=4M -iodepth=64 -numjobs=16 -rw= randwrite -direct=1 -runtime=60 -filename=/mnt/disk/test1 -size=10g). Can you give me any ideas where the problem might be, perhaps with RBD cache? Network capacity and usual things has been tested already. I will be able to provide more specs if needed. Thanks!

3 years, 10 months

1
0
0 0

2024

2023

2022

2021

2020

2019

ceph-users June 2020