Hello all,
For multi-node NFS Ganesha over CephFS, is it OK to leave libcephfs write caching on, or should it be turned off for failover?
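For reference, a minimal sketch of what turning the libcephfs write cache off would look like, assuming the `client_oc` option (the libcephfs object-cacher toggle) and a dedicated `client.ganesha` identity for the Ganesha daemons; whether caching must actually be off for safe multi-node failover is exactly the open question here:

```ini
# Hedged sketch, not verified guidance: disable the libcephfs object
# cache for the Ganesha client. Assumes client_oc applies to your
# release and that Ganesha authenticates as client.ganesha.
[client.ganesha]
    client oc = false
```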
Cheers /Maged
Hello List,
I use rbd-mirror to asynchronously mirror to my backup cluster.
My backup cluster only has "spinning rust" and won't always be able to
perform like the live cluster.
That is fine for me, as long as it's not more than 12h behind.
vm-194-disk-1:
global_id: 7a95730f-451c-4973-8038-2a59e29ac5ad
state: up+replaying
description: replaying, master_position=[object_number=1046,
tag_tid=4, entry_tid=936210], mirror_position=[object_number=911,
tag_tid=4, entry_tid=815131], entries_behind_master=121079
last_update: 2020-03-24 08:43:43
I learned that entries_behind_master counts single transactions. But
what I am really interested in is: how far am I behind time-wise?
Is there a way to tell this?
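Since entries_behind_master is a count of journal entries rather than a time, one rough way to get a wall-clock figure is to sample it twice and extrapolate from the drain rate. A sketch with hypothetical sample values (take real ones from `rbd mirror image status` a minute apart; the second value below is invented for illustration):

```shell
# Estimate catch-up time from two samples of entries_behind_master.
# Assumes a roughly constant drain rate over the sampling window.
b0=121079   # entries_behind_master at first sample (from the status above)
b1=120000   # hypothetical value 60 s later
dt=60       # seconds between the two samples
drained=$((b0 - b1))
if [ "$drained" -le 0 ]; then
    echo "not draining - the mirror is falling further behind"
else
    # ETA in hours: remaining entries / (entries drained per second)
    eta=$(awk -v b="$b1" -v d="$drained" -v t="$dt" \
        'BEGIN { printf "%.1f", b * t / d / 3600 }')
    echo "~${eta} h until the mirror catches up"
fi
```

Note this only measures how long the catch-up would take at the current rate, not when the lagging entries were originally written, but it is usually close enough to check an "is it within 12h" style SLA.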
Thanks,
Michael
So, I thought I’d post with what I learned re: what to do with this problem.
This system is a 3-node Proxmox cluster, and each node had:
1 x 1TB NVMe
2 x 512GB HDD
I had maybe 100GB of data in this system total. Then I added:
2 x 256GB SSD
1 x 1TB HDD
To each system, and let it start rebalancing. When it started the management interface showed the storage as being out of order in various ways, but it was clear that Ceph was rebalancing PGs across the 3 nodes and the “broken” part of the graphic display was shrinking as it spread data across the added OSDs.
In the process, however, the monitors racked up ENORMOUS numbers of files. On one machine the boot drive has only 64GB total, so the partition holding /var/lib/ceph/somethingsomething.db was only 27GB. It filled up very fast and eventually killed the monitor on that node. I figured out you can run `ceph-monstore-tool compact` or `ceph-kvstore-tool rocksdb /path compact` to compact the store, but even with those jobs scheduled on each monitor every minute, the rocksdb files kept growing until they threatened to kill the monitors on the nodes with more space, too. Other, dumber measures I took to give the system more space for these files ended up breaking my Proxmox install, so now I have to reinstall.
What can be done about this problem so that I don’t have this issue when I try to implement again?
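One likely reason compaction alone didn't help: during a big rebalance the monitors keep every osdmap epoch and cannot trim them until all PGs are back to active+clean, so the store genuinely needs headroom until recovery finishes. A sketch of the usual mitigations, assuming Nautilus-era option names:

```ini
# Sketch: give the mon store room to grow during rebalances.
[mon]
    # relocate the store to a partition with plenty of free space
    # (hypothetical path; default is /var/lib/ceph/mon/$cluster-$id)
    mon data = /srv/ceph/mon/$id
    # compact the rocksdb store whenever the mon (re)starts
    mon compact on start = true
```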
Thanks!
Indeed. I just had another MGR go bye-bye. I don't think host clock skew
is the problem.
On 13/03/2020 15:29, Anthony D'Atri wrote:
> Chrony does converge faster, but I doubt this will solve your problem if you don’t have quality peers. Or if it’s not really a time problem.
>
>> On Mar 13, 2020, at 6:44 AM, Janek Bevendorff <janek.bevendorff(a)uni-weimar.de> wrote:
>>
>> I replaced ntpd with chronyd and will let you know if it changes anything. Thanks.
>>
>>
>>> On 13/03/2020 06:25, Konstantin Shalygin wrote:
>>>> On 3/13/20 12:57 AM, Janek Bevendorff wrote:
>>>> NTPd is running, all the nodes have the same time to the second. I don't think that is the problem.
>>> As always in such cases, try switching from ntpd to the default EL7 daemon, chronyd.
>>>
>>>
>>>
>>> k
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Bauhaus-Universität Weimar
Bauhausstr. 9a, Room 308
99423 Weimar, Germany
Phone: +49 (0)3643 - 58 3577
> I suppose the correct syntax is that anything after "client." is the
> name? So:
>
> ceph fs authorize cephfs client.bob / r / rw
>
> Would authorize a client named bob?
Yes, exactly:
admin:~ # ceph fs authorize cephfs client.bob / r / rw
[client.bob]
key = AQAyw3leAv9tKxAA+wtNEa40yK6svPE/VPlqdA==
admin:~ # mount -t ceph mon1:/ /mnt/ -o
name=bob,secret=AQAyw3leAv9tKxAA+wtNEa40yK6svPE/VPlqdA==
admin:~ # touch /mnt/file
Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
> That was it! I am not sure how I got confused with the client name
> syntax. When I issued the command to create a client key, I used:
>
> ceph fs authorize cephfs client.1 / r / rw
>
> I assumed from the syntax that my client name is "client.1"
>
> I suppose the correct syntax is that anything after "client." is the
> name? So:
>
> ceph fs authorize cephfs client.bob / r / rw
>
> Would authorize a client named bob?
>
> -Scott
> ________________________________
> From: Eugen Block <eblock(a)nde.ag>
> Sent: Monday, March 23, 2020 11:30 AM
> To: Dungan, Scott A. <sdungan(a)caltech.edu>
> Cc: Yan, Zheng <ukernel(a)gmail.com>; ceph-users(a)ceph.io <ceph-users(a)ceph.io>
> Subject: Re: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>
> Wait, your client name is just "1"? In that case you need to specify
> that in your mount command:
>
> mount ... -o name=1,secret=...
>
> It has to match your ceph auth settings, where "client" is only a
> prefix and is followed by the client's name:
>
> [client.1]
>
>
> Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
>
>> Tried that:
>>
>> [client.1]
>> key = *******************************
>> caps mds = "allow rw path=/"
>> caps mon = "allow r"
>> caps osd = "allow rw tag cephfs pool=meta_data, allow rw pool=data"
>>
>> No change.
>>
>>
>> ________________________________
>> From: Yan, Zheng <ukernel(a)gmail.com>
>> Sent: Sunday, March 22, 2020 9:28 PM
>> To: Dungan, Scott A. <sdungan(a)caltech.edu>
>> Cc: Eugen Block <eblock(a)nde.ag>; ceph-users(a)ceph.io <ceph-users(a)ceph.io>
>> Subject: Re: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>>
>> On Sun, Mar 22, 2020 at 8:21 AM Dungan, Scott A.
>> <sdungan(a)caltech.edu> wrote:
>>>
>>> Eugen, thanks for the tips.
>>>
>>> I tried appending the key directly in the mount command
>>> (secret=<CLIENT.1.SECRET>) and that produced the same error.
>>>
>>> I took a look at the thread you suggested and ran the commands
>>> Paul at Croit suggested, even though the ceph dashboard already
>>> showed "cephfs" as the application on both my data and metadata
>>> pools:
>>>
>>> [root@ceph-n4 ~]# ceph osd pool application set data cephfs data cephfs
>>> set application 'cephfs' key 'data' to 'cephfs' on pool 'data'
>>> [root@ceph-n4 ~]# ceph osd pool application set meta_data cephfs
>>> metadata cephfs
>>> set application 'cephfs' key 'metadata' to 'cephfs' on pool 'meta_data'
>>>
>>> No change. I get the "mount error 1 = Operation not permitted"
>>> error the same as before.
>>>
>>> I also tried manually editing the caps osd pool tags for my
>>> client.1, to allow rw to both the data pool as well as the metadata
>>> pool, as suggested further in the thread:
>>>
>>> [client.1]
>>> key = ***********************************
>>> caps mds = "allow rw path=all"
>>
>>
>> try replacing this with "allow rw path=/"
>>
>>> caps mon = "allow r"
>>> caps osd = "allow rw tag cephfs pool=meta_data, allow rw pool=data"
>>>
>>> No change.
>>>
>>> ________________________________
>>> From: Eugen Block <eblock(a)nde.ag>
>>> Sent: Saturday, March 21, 2020 1:16 PM
>>> To: ceph-users(a)ceph.io <ceph-users(a)ceph.io>
>>> Subject: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>>>
>>> I just remembered there was a thread [1] about that a couple of weeks
>>> ago. Seems like you need to add the capabilities to the client.
>>>
>>> [1]
>>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDV…
>>>
>>>
>>> Quoting Eugen Block <eblock(a)nde.ag>:
>>>
>>> > Hi,
>>> >
>>> > have you tried to mount with the secret only instead of a secret file?
>>> >
>>> > mount -t ceph ceph-n4:6789:/ /ceph -o name=client.1,secret=<SECRET>
>>> >
>>> > If that works your secret file is not right. If not you should check
>>> > if the client actually has access to the cephfs pools ('ceph auth
>>> > list').
>>> >
>>> >
>>> >
>>> > Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
>>> >
>>> >> I am still very new to ceph and I have just set up my first small
>>> >> test cluster. I have Cephfs enabled (named cephfs) and everything
>>> >> is good in the dashboard. I added an authorized user key for cephfs
>>> >> with:
>>> >>
>>> >> ceph fs authorize cephfs client.1 / r / rw
>>> >>
>>> >> I then copied the key to a file with:
>>> >>
>>> >> ceph auth get-key client.1 > /tmp/client.1.secret
>>> >>
>>> >> Copied the file over to the client and then attempted to mount
>>> >> with the kernel driver:
>>> >>
>>> >> mount -t ceph ceph-n4:6789:/ /ceph -o
>>> >> name=client.1,secretfile=/root/client.1.secret
>>> >> mount error 1 = Operation not permitted
>>> >>
>>> >> I looked in the logs on the mds (which is also the mgr and mon for
>>> >> the cluster) and I don't see any events logged for this. I also
>>> >> tried the mount command with verbose and I didn't get any further
>>> >> detail. Any tips would be most appreciated.
>>> >>
>>> >> --
>>> >>
>>> >> Scott Dungan
>>> >> California Institute of Technology
>>> >> Office: (626) 395-3170
>>> >> sdungan(a)caltech.edu<mailto:sdungan@caltech.edu>
>>> >>
>>> >> _______________________________________________
>>> >> ceph-users mailing list -- ceph-users(a)ceph.io
>>> >> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi Ceph users!
We are currently configuring our new production Ceph cluster and I have
some questions regarding Ubuntu and NVMe SSDs.
Basic setup:
- Ubuntu 18.04 with HWE Kernel 5.3
- Deployment via ceph-ansible (Ceph stable "Nautilus")
- 5x Nodes with AMD EPYC 7402P CPUs
- 25Gbit/s NICs and switches for Ceph private and public network
- 4x Intel P4510 2TB NVMe SSDs (all flash) per Node
My questions:
1. Should we deploy more than one OSD per NVMe SSD? (as P4510's
performance can sustain e.g. 2 OSDs)
2. Does anyone know NVMe specific Linux settings we should enable?
3. Can we use io_uring, if yes how can we enable it? Is it enough to set
bluestore_iouring=true?
What I know so far:
Ad 1: My opinion is to use at least 2 OSDs per NVMe SSD, as the Intel
P4510 is fast enough to serve the parallel requests.
Make sure to use the latest firmware version VDV10170 -> with
version VDV10131 we had massive stalls on the Ceph side!
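A sketch of what that could look like at deployment time, assuming the Nautilus-era ceph-volume `lvm batch --osds-per-device` flag (ceph-ansible exposes the same knob as its `osds_per_device` variable); device paths are examples, and `--report` is included so nothing is actually created:

```shell
# Sketch: carve each NVMe drive into 2 OSDs with ceph-volume.
# Drop --report to actually deploy; shown here as a dry run only.
cmd='ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 --report'
echo "$cmd"
```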
Ad 2: I have already enabled NVMe polling queues, Ubuntu has disabled
them by default:
Added nvme.poll_queues=1 to /etc/default/grub, then checked
/sys/block/nvme1n1/queue/io_poll
Cf.
https://lore.kernel.org/linux-block/20190318222133.GA24176@localhost.locald…
Ad 3: This commit states it should be possible to use io_uring:
https://github.com/ceph/ceph/pull/27392
This issue also shows how to set bluestore_iouring=true but it's not
clear if any more setup is required, like liburing:
https://github.com/axboe/liburing
A presentation from Christoph Hellwig shows the advantages:
https://www.snia.org/sites/default/files/SDC/2019/presentations/NVMe/Hellwi…
Any help and inputs would be appreciated,
THX - Georg
Evening,
We are running into issues exporting a disk image from Ceph RBD when we attempt to export an image in a cache-tiered erasure-coded pool on Luminous.
All the other disks are working fine, but this one is acting up. We have a fair amount of important data on other disks, so we obviously want to make sure this doesn't happen to those.
[root@ceph-p-mon1 home]# rbd export one/one-177-588-0 one-177-588-0
Exporting image: 8% complete...rbd: error reading from source image at offset 5456789504: (5) Input/output error
2020-03-23 20:11:29.210718 7f2f3effd700 -1 librbd::io::ObjectRequest: 0x7f2f2c128f90 handle_read_object: failed to read from object: (5) Input/output error
2020-03-23 20:11:29.565184 7f2f3e7fc700 -1 librbd::io::ObjectRequest: 0x7f2f280c84d0 handle_read_cache: failed to read from cache: (5) Input/output error
Exporting image: 8% complete...failed.
rbd: export error: (5) Input/output error
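Since the image uses 4MiB objects (order 22), the failing offset maps to exactly one backing RADOS object, which you can then inspect directly. A sketch using the numbers from the error and the `rbd info` output below (the `rados` command at the end is only a suggestion; with a cache tier you may need to check both the cache and the base pool):

```shell
# Map the failing export offset to its backing RADOS object name.
# Format-2 RBD objects are named <block_name_prefix>.<16-hex object number>.
offset=5456789504              # offset from the export error above
objsize=$((4 * 1024 * 1024))   # order 22 -> 4 MiB objects
prefix=rbd_data.84a01279e2a9e3 # block_name_prefix from `rbd info`
obj=$(printf '%s.%016x' "$prefix" $((offset / objsize)))
echo "$obj"
# then e.g.: rados -p one stat "$obj"
```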
Any thoughts would be appreciated.
Some info:
[root@ceph-p-mon1 home]# rbd info one/one-177-588-0
rbd image 'one-177-588-0':
size 58.6GiB in 15000 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.84a01279e2a9e3
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Fri Apr 20 17:06:09 2018
parent: one/one-177@snap
overlap: 2.20GiB
[root@ceph-p-mon1 home]# ceph status
cluster:
id: 6a2e8f21-bca2-492b-8869-eecc995216cc
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-p-mon2,ceph-p-mon1,ceph-p-mon3
mgr: ceph-p-mon2(active)
mds: cephfsec-1/1/1 up {0=ceph-p-mon2=up:active}, 6 up:standby
osd: 155 osds: 154 up, 154 in
data:
pools: 6 pools, 5904 pgs
objects: 145.53M objects, 192TiB
usage: 253TiB used, 290TiB / 543TiB avail
pgs: 5896 active+clean
8 active+clean+scrubbing+deep
io:
client: 921KiB/s rd, 5.68MiB/s wr, 110op/s rd, 29op/s wr
cache: 5.65MiB/s flush, 0op/s promote
[root@ceph-p-mon1 home]# rpm -qa | grep ceph
ceph-common-12.2.9-0.el7.x86_64
ceph-mds-12.2.9-0.el7.x86_64
ceph-radosgw-12.2.9-0.el7.x86_64
ceph-mgr-12.2.9-0.el7.x86_64
ceph-12.2.9-0.el7.x86_64
collectd-ceph-5.8.1-1.el7.x86_64
ceph-deploy-2.0.1-0.noarch
libcephfs2-12.2.9-0.el7.x86_64
python-cephfs-12.2.9-0.el7.x86_64
ceph-selinux-12.2.9-0.el7.x86_64
ceph-osd-12.2.9-0.el7.x86_64
ceph-base-12.2.9-0.el7.x86_64
ceph-mon-12.2.9-0.el7.x86_64
ceph-release-1-1.el7.noarch
Rhian Resnick
Associate Director Research Computing
Enterprise Systems
Office of Information Technology
Florida Atlantic University
777 Glades Road, CM22, Rm 173B
Boca Raton, FL 33431
Phone 561.297.2647
Fax 561.297.0222
[image] <https://hpc.fau.edu/wp-content/uploads/2015/01/image.jpg>
Hi,
I'm not able to bootstrap an OSD container for a physical device or LVM.
Has anyone been able to bootstrap it?
Sorry if this is not the correct place to post this question. If not, I
apologize and would be grateful if anyone could redirect me to the correct
place.
Thanks in advance
Oscar