Hi
I would like to change the CRUSH rule so data lands on SSDs instead of HDDs. Can this be done on the fly, with the migration just happening, or do I need to do something to move the data?
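What I have in mind is roughly the following (rule and pool names are just placeholders, and this assumes the OSDs already report an ssd device class):
```
# create a replicated rule restricted to OSDs of class ssd
ceph osd crush rule create-replicated replicated_ssd default host ssd

# point the pool at the new rule; as far as I understand, Ceph then rebalances the data by itself
ceph osd pool set mypool crush_rule replicated_ssd
```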
Jesper
Sent from myMail for iOS
I am going to attempt to answer my own question here, and someone can correct me if I am wrong.
Looking at a few of the other OSDs that we have replaced over the last year or so, it looks like they are mounted using tmpfs as well. This appears to be just a result of switching from Filestore to Bluestore, and really nothing to worry about.
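In case anyone wants to double-check the same thing on their own nodes, this is roughly what I looked at (osd.246 is just one of the new OSDs, as an example):
```
# on the new host: shows which LVs/devices actually back each OSD;
# with bluestore only a little metadata lives in the tmpfs mount
ceph-volume lvm list

# confirm the OSD reports its real capacity to the cluster
ceph osd df tree | grep osd.246
```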
Thanks,
Shain
On 9/9/20, 11:16 AM, "Shain Miley" <SMiley(a)npr.org> wrote:
Hi,
I recently added 3 new servers to our Ceph cluster. These servers use the H740P Mini RAID card, and I had to install the HWE kernel on Ubuntu 16.04 in order to get the drives recognized.
We have a 23-node cluster, and normally when we add OSDs they end up mounted like this:
/dev/sde1 3.7T 2.0T 1.8T 54% /var/lib/ceph/osd/ceph-15
/dev/sdj1 3.7T 2.0T 1.7T 55% /var/lib/ceph/osd/ceph-20
/dev/sdd1 3.7T 2.1T 1.6T 58% /var/lib/ceph/osd/ceph-14
/dev/sdc1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-13
However, I noticed this morning that the 3 new servers have their OSDs mounted like this:
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-246
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-240
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-248
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-237
Is this normal for deployments going forward…or did something go wrong? These are 12 TB drives, but they are showing up as 47G here instead.
We are using Ceph version 12.2.13, and I installed this using ceph-deploy version 2.0.1.
Thanks in advance,
Shain
Shain Miley | Director of Platform and Infrastructure | Digital Media | smiley(a)npr.org
Hi all,
I have built a testing cluster with 4 hosts, 1 SSD and 11 HDDs on each host.
Running ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable) on Ubuntu.
Because we want to store small objects, I set bluestore_min_alloc_size to 8192 (it is maybe important in this case).
I filled the cluster through the RADOS gateway with approximately a billion small objects. After the tests I changed min_alloc_size back and deleted the RADOS pools (to empty the whole cluster), and I was waiting for the cluster to delete the data from the OSDs, but that destabilized the cluster. I never reached HEALTH_OK. OSDs were killed in random order. I can start them back up, but they drop out of the cluster again with:
```
-18> 2020-09-05 22:11:19.430 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064359424 unmapped: 8708096 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-17> 2020-09-05 22:11:19.430 7f7a3ee40700 5 bluestore.MempoolThread(0x555a9d0efb70) _trim_shards cache_size: 1932735282 kv_alloc: 1644167168 kv_used: 1644135504 meta_alloc: 142606336 meta_used: 143595 data_alloc: 142606336 data_used: 98304
-16> 2020-09-05 22:11:20.434 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064941056 unmapped: 8126464 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-15> 2020-09-05 22:11:21.434 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064359424 unmapped: 8708096 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-14> 2020-09-05 22:11:22.258 7f7a2b81f700 5 osd.42 103257 heartbeat osd_stat(store_statfs(0x1ce18290000/0x2d08c0000/0x1d180000000, data 0x23143355/0x974a0000, compress 0x0/0x0/0x0, omap 0x1f11e, meta 0x2d08a0ee2), peers [3,4,6,7,8,11,12,13,14,16,17,18,19,21,23,24,25,27,28,29,31,32,33,34,41,43] op hist [])
-13> 2020-09-05 22:11:22.438 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064359424 unmapped: 8708096 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-12> 2020-09-05 22:11:23.442 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064359424 unmapped: 8708096 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-11> 2020-09-05 22:11:24.442 7f7a3ee40700 5 prioritycache tune_memory target: 3221225472 mapped: 2064285696 unmapped: 8781824 heap: 2073067520 old mem: 1932735282 new mem: 1932735282
-10> 2020-09-05 22:11:24.442 7f7a3ee40700 5 bluestore.MempoolThread(0x555a9d0efb70) _trim_shards cache_size: 1932735282 kv_alloc: 1644167168 kv_used: 1644119840 meta_alloc: 142606336 meta_used: 143595 data_alloc: 142606336 data_used: 98304
-9> 2020-09-05 22:11:24.442 7f7a2e024700 0 bluestore(/var/lib/ceph/osd/ceph-42) log_latency_fn slow operation observed for _collection_list, latency = 151.113s, lat = 2m cid =5.47_head start #5:e2000000::::0# end #MAX# max 2147483647
-8> 2020-09-05 22:11:24.446 7f7a2e024700 1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f7a2e024700' had timed out after 15
-7> 2020-09-05 22:11:24.446 7f7a2e024700 1 heartbeat_map reset_timeout 'OSD::osd_op_tp thread 0x7f7a2e024700' had suicide timed out after 150
-6> 2020-09-05 22:11:24.446 7f7a4c2a4700 10 monclient: get_auth_request con 0x555b15d07680 auth_method 0
-5> 2020-09-05 22:11:24.446 7f7a3c494700 2 osd.42 103257 ms_handle_reset con 0x555b15963600 session 0x555a9f9d6d00
-4> 2020-09-05 22:11:24.446 7f7a3c494700 2 osd.42 103257 ms_handle_reset con 0x555b15961b00 session 0x555a9f9d7980
-3> 2020-09-05 22:11:24.446 7f7a3c494700 2 osd.42 103257 ms_handle_reset con 0x555b15963a80 session 0x555a9f9d6a80
-2> 2020-09-05 22:11:24.446 7f7a3c494700 2 osd.42 103257 ms_handle_reset con 0x555b15960480 session 0x555a9f9d6f80
-1> 2020-09-05 22:11:24.446 7f7a3c494700 3 osd.42 103257 handle_osd_map epochs [103258,103259], i have 103257, src has [83902,103259]
0> 2020-09-05 22:11:24.450 7f7a2e024700 -1 *** Caught signal (Aborted) **
```
I have approximately 12 OSDs down with this error.
I decided to wipe the problematic OSDs, so I can no longer debug them, but I'm curious what I did wrong (deleting a pool with many small objects?) and what to do next time.
I have done this before, but not with a billion objects and not with the bluestore_min_alloc_size change, and it worked without problems.
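For the record, these are the knobs I am planning to look at before trying this again; the option names should exist, but the values are only guesses on my side:
```
# throttle PG/object deletion so the OSD op threads are not starved
ceph config set osd osd_delete_sleep 2

# give the op thread more headroom before the suicide timeout during very long collection listings
ceph config set osd osd_op_thread_suicide_timeout 600
```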
With regards
Jan Pekar
--
============
Ing. Jan Pekař
jan.pekar(a)imatic.cz
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz | +420326555326
============
--
I'm trying to deploy a Ceph cluster with the cephadm tool. I've already successfully done all steps except adding OSDs. My testing equipment consists of three hosts. Each host has SSD storage, where the OS is installed; on that SSD I created a partition which can be used as a Ceph block.db. Each host also has 2 additional HDDs (spinning drives) for OSD data. I couldn't find in the docs how to deploy such a configuration. Do you have any hints on how to do that?
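To make it more concrete, this is roughly the kind of OSD service spec I was expecting to write and apply with "ceph orch apply osd -i osd-spec.yml" (the service id and the db path are placeholders, and I am not sure a partition is accepted as a db device):
```
service_type: osd
service_id: hdd_with_ssd_db
placement:
  host_pattern: '*'
data_devices:
  rotational: 1        # the two spinning HDDs on each host
db_devices:
  paths:
    - /dev/sda4        # placeholder for the SSD partition reserved for block.db
```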
Thanks for help!
Hi
One of my clusters running Nautilus 14.2.8 is very slow (13 seconds or so, where my other clusters return almost instantaneously) when doing a
'rados --pool rc3-se.rgw.buckets.index ls' from one of the monitors.
I checked
- ceph status => OK
- routing to/from osds ok (I see a lot of established connections to osds
due to the command, nothing in syn_sent indicating incomplete handshake)
- ping times are OK
- no interface errors
- no packet drops
- no increasing send queues
- and as far as I can see nothing out of the ordinary in mon and osd logs
I have no clue how to debug the issue. If someone has pointers, it would be much appreciated.
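In case it helps, these are the next things I was planning to look at (just guesses on my side):
```
# per-OSD commit/apply latency, to spot a single slow OSD
ceph osd perf

# which OSDs actually hold the index pool's PGs
ceph pg ls-by-pool rc3-se.rgw.buckets.index

# any large omap / bucket index warnings
ceph health detail
```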
Kind Regards
Marcel
Thank you!
I know that article, but they promise use of 6 cores per OSD, and I got barely over three, and all of this in a totally synthetic environment with no SSD to blame (brd is more than fast enough and has very consistent latency under any kind of load).
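For completeness, these are the threading-related options I have been poking at; they do exist as options, but the values below are just what I tried, not a recommendation (and I believe the OSD needs a restart to pick them up):
```
# number of op queue shards and worker threads per shard for the OSD
ceph config set osd osd_op_num_shards 8
ceph config set osd osd_op_num_threads_per_shard 2
```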
On Thu, Sep 10, 2020, 19:39 Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>
>
>
> Hi George,
>
> Very interesting, and also a somewhat expected result. Some messages posted
> here are already indicating that getting expensive top of the line
> hardware does not really result in any performance increase above some
> level. Vitaliy has documented something similar[1]
>
> [1]
> https://yourcmc.ru/wiki/Ceph_performance
>
>
>
> -----Original Message-----
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] ceph-osd performance on ram disk
>
> I'm creating a benchmark suite for Ceph.
>
> While benchmarking the benchmark itself, I checked how fast ceph-osd works.
> I decided to skip all 'SSD mess' and use brd (block ram disk, modprobe
> brd) as underlying storage. Brd itself can yield up to 2.7Mpps in fio.
> In single thread mode (iodepth=1) it can yield up to 750k IOPS. LVM over
> brd gives about 600kIOPS in single-threaded mode with iodepth=1 (16us
> latency).
>
> But, as soon as I put ceph-osd (bluestore) on it, I see something very
> odd. No matter how much parallel load I push onto this OSD, it never
> gives more than 30 kIOPS, and I can't understand where the bottleneck is.
>
> CPU utilization: ~300%. There are 8 cores on my setup, so CPU is not the
> bottleneck.
>
> Network: I've moved benchmark on the same host as OSD, so it's a
> localhost. Even counting network, it's still far away from saturation.
> 30kIOPS (4k) is about 1Gb/s, but I have 10G links. Anyway, tests are run
> on localhost, so network is irrelevant (I've checked it, traffic is on
> localhost). Test itself consumes about 70% CPU of one core, so there are
> plenty left.
>
> Replication: I've killed it (size=1, single osd in the pool).
>
> single-threaded latency: 200us, 4.8kIOPS.
> iodepth=32: 2ms (15kIOPS).
> iodepth=16,numjobs=8: 5ms (24k IOPS)
>
> I'm running fio with 'rados' ioengine, and it looks like putting more
> workers doesn't change much, so it's not rados ioengine.
>
> As there is plenty of CPU and IO left, there is only one possible place for
> the bottleneck: some time-consuming single-threaded code in ceph-osd.
>
> Are there any knobs to tweak to see higher performance for ceph-osd? I'm
> pretty sure it's not any kind of leveling, GC or other 'iops-related'
> issues (brd has performance two orders of magnitude higher).
>
>
A naive ceph user asks:
I have a 3-node cluster configured with 72 Bluestore OSDs running on Ubuntu 20.04, Ceph Octopus 15.2.4.
The cluster is configured via ceph-ansible stable-5.0.
No configuration changes have been made outside of what is generated by ceph-ansible.
I expected "ceph config dump" to show the entire configuration:
# ceph config dump
WHO MASK LEVEL OPTION VALUE RO
# echo $?
0
Is my expectation wrong, or is something broken?
If I set a value via "ceph config set", then it is reflected in the "ceph config dump" output:
# ceph config set osd osd_memory_target 2147483648
# ceph config dump
WHO MASK LEVEL OPTION VALUE RO
osd basic osd_memory_target 2147483648
If my expectation is wrong, then how does one view the configuration, defaults and all?
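For completeness, these are the commands I have found so far that show more than the monitor's config database, so maybe I am just misreading what "ceph config dump" is meant to cover (osd.0 is only an example):
```
# settings for one daemon, including compiled-in defaults and ceph.conf values
ceph config show-with-defaults osd.0

# the same, asked of the daemon itself via its admin socket (run on the host where osd.0 lives)
ceph daemon osd.0 config show
```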
Thanks,
--
Dave Baukus