Hello all,
For multi-node NFS Ganesha over CephFS, is it OK to leave libcephfs write caching on, or should it be turned off for failover?
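For reference, a minimal sketch of what turning the libcephfs write cache off would look like, assuming the `client_oc` option (the libcephfs object-cacher toggle) and a dedicated `client.ganesha` identity for the Ganesha daemons; whether caching must actually be off for safe multi-node failover is exactly the open question here:

```ini
# Hedged sketch, not verified guidance: disable the libcephfs object
# cache for the Ganesha client. Assumes client_oc applies to your
# release and that Ganesha authenticates as client.ganesha.
[client.ganesha]
    client oc = false
```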
Cheers /Maged
Hello List,
I use rbd-mirror to asynchronously mirror to my backup cluster.
My backup cluster only has "spinning rust" and won't always be able to
perform like the live cluster.
That is fine for me, as long as it's not more than 12h behind.
vm-194-disk-1:
global_id: 7a95730f-451c-4973-8038-2a59e29ac5ad
state: up+replaying
description: replaying, master_position=[object_number=1046,
tag_tid=4, entry_tid=936210], mirror_position=[object_number=911,
tag_tid=4, entry_tid=815131], entries_behind_master=121079
last_update: 2020-03-24 08:43:43
I learned that entries_behind_master counts single transactions. But
what I am really interested in is: how far am I behind time-wise?
Is there a way to tell this?
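Since entries_behind_master is a count of journal entries rather than a time, one rough way to get a wall-clock figure is to sample it twice and extrapolate from the drain rate. A sketch with hypothetical sample values (take real ones from `rbd mirror image status` a minute apart; the second value below is invented for illustration):

```shell
# Estimate catch-up time from two samples of entries_behind_master.
# Assumes a roughly constant drain rate over the sampling window.
b0=121079   # entries_behind_master at first sample (from the status above)
b1=120000   # hypothetical value 60 s later
dt=60       # seconds between the two samples
drained=$((b0 - b1))
if [ "$drained" -le 0 ]; then
    echo "not draining - the mirror is falling further behind"
else
    # ETA in hours: remaining entries / (entries drained per second)
    eta=$(awk -v b="$b1" -v d="$drained" -v t="$dt" \
        'BEGIN { printf "%.1f", b * t / d / 3600 }')
    echo "~${eta} h until the mirror catches up"
fi
```

Note this only measures how long the catch-up would take at the current rate, not when the lagging entries were originally written, but it is usually close enough to check an "is it within 12h" style SLA.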
Thanks,
Michael
So, I thought I’d post with what I learned re: what to do with this problem.
This system is a 3-node Proxmox cluster, and each node had:
1 x 1TB NVMe
2 x 512GB HDD
I had maybe 100GB of data in this system total. Then I added:
2 x 256GB SSD
1 x 1TB HDD
To each system, and let it start rebalancing. When it started the management interface showed the storage as being out of order in various ways, but it was clear that Ceph was rebalancing PGs across the 3 nodes and the “broken” part of the graphic display was shrinking as it spread data across the added OSDs.
In the process, however, the monitors racked up ENORMOUS numbers of files. On one machine the boot drive has only 64GB total, so the partition holding /var/lib/ceph/somethingsomething.db was only 27GB. It filled up very fast and eventually killed the monitor on that node. I figured out you can run `ceph-monstore-tool compact` or `ceph-kvstore-tool rocksdb /path compact` to compact the store, but even with those jobs scheduled on each monitor every minute, the rocksdb files kept growing until they threatened to kill the monitors on the nodes with more space, too. Other, dumber measures I took to give the system more space for these files ended up breaking my Proxmox install, so now I have to reinstall.
What can be done about this problem so that I don’t have this issue when I try to implement again?
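One likely reason compaction alone didn't help: during a big rebalance the monitors keep every osdmap epoch and cannot trim them until all PGs are back to active+clean, so the store genuinely needs headroom until recovery finishes. A sketch of the usual mitigations, assuming Nautilus-era option names:

```ini
# Sketch: give the mon store room to grow during rebalances.
[mon]
    # relocate the store to a partition with plenty of free space
    # (hypothetical path; default is /var/lib/ceph/mon/$cluster-$id)
    mon data = /srv/ceph/mon/$id
    # compact the rocksdb store whenever the mon (re)starts
    mon compact on start = true
```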
Thanks!
Indeed. I just had another MGR go bye-bye. I don't think host clock skew
is the problem.
On 13/03/2020 15:29, Anthony D'Atri wrote:
> Chrony does converge faster, but I doubt this will solve your problem if you don’t have quality peers. Or if it’s not really a time problem.
>
>> On Mar 13, 2020, at 6:44 AM, Janek Bevendorff <janek.bevendorff(a)uni-weimar.de> wrote:
>>
>> I replaced ntpd with chronyd and will let you know if it changes anything. Thanks.
>>
>>
>>> On 13/03/2020 06:25, Konstantin Shalygin wrote:
>>>> On 3/13/20 12:57 AM, Janek Bevendorff wrote:
>>>> NTPd is running, all the nodes have the same time to the second. I don't think that is the problem.
>>> As always in such cases, try switching from ntpd to the default EL7 daemon, chronyd.
>>>
>>>
>>>
>>> k
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
--
Bauhaus-Universität Weimar
Bauhausstr. 9a, Room 308
99423 Weimar, Germany
Phone: +49 (0)3643 - 58 3577
> I suppose the correct syntax is that anything after "client." is the
> name? So:
>
> ceph fs authorize cephfs client.bob / r / rw
>
> Would authorize a client named bob?
Yes, exactly:
admin:~ # ceph fs authorize cephfs client.bob / r / rw
[client.bob]
key = AQAyw3leAv9tKxAA+wtNEa40yK6svPE/VPlqdA==
admin:~ # mount -t ceph mon1:/ /mnt/ -o
name=bob,secret=AQAyw3leAv9tKxAA+wtNEa40yK6svPE/VPlqdA==
admin:~ # touch /mnt/file
Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
> That was it! I am not sure how I got confused with the client name
> syntax. When I issued the command to create a client key, I used:
>
> ceph fs authorize cephfs client.1 / r / rw
>
> I assumed from the syntax that my client name is "client.1"
>
> I suppose the correct syntax is that anything after "client." is the
> name? So:
>
> ceph fs authorize cephfs client.bob / r / rw
>
> Would authorize a client named bob?
>
> -Scott
> ________________________________
> From: Eugen Block <eblock(a)nde.ag>
> Sent: Monday, March 23, 2020 11:30 AM
> To: Dungan, Scott A. <sdungan(a)caltech.edu>
> Cc: Yan, Zheng <ukernel(a)gmail.com>; ceph-users(a)ceph.io <ceph-users(a)ceph.io>
> Subject: Re: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>
> Wait, your client name is just "1"? In that case you need to specify
> that in your mount command:
>
> mount ... -o name=1,secret=...
>
> It has to match your ceph auth settings, where "client" is only a
> prefix and is followed by the client's name:
>
> [client.1]
>
>
> Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
>
>> Tried that:
>>
>> [client.1]
>> key = *******************************
>> caps mds = "allow rw path=/"
>> caps mon = "allow r"
>> caps osd = "allow rw tag cephfs pool=meta_data, allow rw pool=data"
>>
>> No change.
>>
>>
>> ________________________________
>> From: Yan, Zheng <ukernel(a)gmail.com>
>> Sent: Sunday, March 22, 2020 9:28 PM
>> To: Dungan, Scott A. <sdungan(a)caltech.edu>
>> Cc: Eugen Block <eblock(a)nde.ag>; ceph-users(a)ceph.io <ceph-users(a)ceph.io>
>> Subject: Re: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>>
>> On Sun, Mar 22, 2020 at 8:21 AM Dungan, Scott A.
>> <sdungan(a)caltech.edu> wrote:
>>>
>>> Eugen, thanks for the tips.
>>>
>>> I tried appending the key directly in the mount command
>>> (secret=<CLIENT.1.SECRET>) and that produced the same error.
>>>
>>> I took a look at the thread you suggested and ran the commands
>>> Paul at Croit suggested, even though the ceph dashboard already
>>> showed "cephfs" as the application on both my data and metadata
>>> pools:
>>>
>>> [root@ceph-n4 ~]# ceph osd pool application set data cephfs data cephfs
>>> set application 'cephfs' key 'data' to 'cephfs' on pool 'data'
>>> [root@ceph-n4 ~]# ceph osd pool application set meta_data cephfs
>>> metadata cephfs
>>> set application 'cephfs' key 'metadata' to 'cephfs' on pool 'meta_data'
>>>
>>> No change. I get the "mount error 1 = Operation not permitted"
>>> error the same as before.
>>>
>>> I also tried manually editing the caps osd pool tags for my
>>> client.1, to allow rw to both the data pool as well as the metadata
>>> pool, as suggested further in the thread:
>>>
>>> [client.1]
>>> key = ***********************************
>>> caps mds = "allow rw path=all"
>>
>>
>> try replacing this with "allow rw path=/"
>>
>>> caps mon = "allow r"
>>> caps osd = "allow rw tag cephfs pool=meta_data, allow rw pool=data"
>>>
>>> No change.
>>>
>>> ________________________________
>>> From: Eugen Block <eblock(a)nde.ag>
>>> Sent: Saturday, March 21, 2020 1:16 PM
>>> To: ceph-users(a)ceph.io <ceph-users(a)ceph.io>
>>> Subject: [ceph-users] Re: Cephfs mount error 1 = Operation not permitted
>>>
>>> I just remembered there was a thread [1] about that a couple of weeks
>>> ago. Seems like you need to add the capabilities to the client.
>>>
>>> [1]
>>> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDV…
>>>
>>>
>>> Quoting Eugen Block <eblock(a)nde.ag>:
>>>
>>> > Hi,
>>> >
>>> > have you tried to mount with the secret only instead of a secret file?
>>> >
>>> > mount -t ceph ceph-n4:6789:/ /ceph -o name=client.1,secret=<SECRET>
>>> >
>>> > If that works your secret file is not right. If not you should check
>>> > if the client actually has access to the cephfs pools ('ceph auth
>>> > list').
>>> >
>>> >
>>> >
>>> > Quoting "Dungan, Scott A." <sdungan(a)caltech.edu>:
>>> >
>>> >> I am still very new to ceph and I have just set up my first small
>>> >> test cluster. I have Cephfs enabled (named cephfs) and everything
>>> >> is good in the dashboard. I added an authorized user key for cephfs
>>> >> with:
>>> >>
>>> >> ceph fs authorize cephfs client.1 / r / rw
>>> >>
>>> >> I then copied the key to a file with:
>>> >>
>>> >> ceph auth get-key client.1 > /tmp/client.1.secret
>>> >>
>>> >> Copied the file over to the client and then attempted to mount
>>> >> with the kernel driver:
>>> >>
>>> >> mount -t ceph ceph-n4:6789:/ /ceph -o
>>> >> name=client.1,secretfile=/root/client.1.secret
>>> >> mount error 1 = Operation not permitted
>>> >>
>>> >> I looked in the logs on the mds (which is also the mgr and mon for
>>> >> the cluster) and I don't see any events logged for this. I also
>>> >> tried the mount command with verbose and I didn't get any further
>>> >> detail. Any tips would be most appreciated.
>>> >>
>>> >> --
>>> >>
>>> >> Scott Dungan
>>> >> California Institute of Technology
>>> >> Office: (626) 395-3170
>>> >> sdungan(a)caltech.edu<mailto:sdungan@caltech.edu>
>>> >>
>>> >> _______________________________________________
>>> >> ceph-users mailing list -- ceph-users(a)ceph.io
>>> >> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>>
>>>
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>> _______________________________________________
>>> ceph-users mailing list -- ceph-users(a)ceph.io
>>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
Hi Ceph users!
We are currently configuring our new production Ceph cluster and I have
some questions regarding Ubuntu and NVMe SSDs.
Basic setup:
- Ubuntu 18.04 with HWE Kernel 5.3
- Deployment via ceph-ansible (Ceph stable "Nautilus")
- 5x Nodes with AMD EPYC 7402P CPUs
- 25Gbit/s NICs and switches for Ceph private and public network
- 4x Intel P4510 2TB NVMe SSDs (all flash) per Node
My questions:
1. Should we deploy more than one OSD per NVMe SSD? (as P4510's
performance can sustain e.g. 2 OSDs)
2. Does anyone know NVMe specific Linux settings we should enable?
3. Can we use io_uring, if yes how can we enable it? Is it enough to set
bluestore_iouring=true?
What I know so far:
Ad 1: My opinion is to use at least 2 OSDs per NVMe SSD, as the Intel
P4510 is fast enough to serve the parallel requests.
Make sure to use the latest firmware version VDV10170 -> with
version VDV10131 we had massive stalls on the Ceph side!
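A sketch of what that could look like at deployment time, assuming the Nautilus-era ceph-volume `lvm batch --osds-per-device` flag (ceph-ansible exposes the same knob as its `osds_per_device` variable); device paths are examples, and `--report` is included so nothing is actually created:

```shell
# Sketch: carve each NVMe drive into 2 OSDs with ceph-volume.
# Drop --report to actually deploy; shown here as a dry run only.
cmd='ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 --report'
echo "$cmd"
```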
Ad 2: I have already enabled NVMe polling queues, Ubuntu has disabled
them by default:
Added nvme.poll_queues=1 to /etc/default/grub, then checked
/sys/block/nvme1n1/queue/io_poll
Cf.
https://lore.kernel.org/linux-block/20190318222133.GA24176@localhost.locald…
Ad 3: This commit states it should be possible to use io_uring:
https://github.com/ceph/ceph/pull/27392
This issue also shows how to set bluestore_iouring=true but it's not
clear if any more setup is required, like liburing:
https://github.com/axboe/liburing
A presentation from Christoph Hellwig shows the advantages:
https://www.snia.org/sites/default/files/SDC/2019/presentations/NVMe/Hellwi…
Any help and inputs would be appreciated,
THX - Georg
Evening,
We are running into issues exporting a disk image from Ceph RBD when we attempt to export an image in a cache-tiered erasure-coded pool on Luminous.
All the other disks are working fine, but this one is acting up. We have a fair amount of important data on other disks, so we obviously want to make sure this doesn't happen to those.
[root@ceph-p-mon1 home]# rbd export one/one-177-588-0 one-177-588-0
Exporting image: 8% complete...rbd: error reading from source image at offset 5456789504: (5) Input/output error
2020-03-23 20:11:29.210718 7f2f3effd700 -1 librbd::io::ObjectRequest: 0x7f2f2c128f90 handle_read_object: failed to read from object: (5) Input/output error
2020-03-23 20:11:29.565184 7f2f3e7fc700 -1 librbd::io::ObjectRequest: 0x7f2f280c84d0 handle_read_cache: failed to read from cache: (5) Input/output error
Exporting image: 8% complete...failed.
rbd: export error: (5) Input/output error
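Since the image uses 4MiB objects (order 22), the failing offset maps to exactly one backing RADOS object, which you can then inspect directly. A sketch using the numbers from the error and the `rbd info` output below (the `rados` command at the end is only a suggestion; with a cache tier you may need to check both the cache and the base pool):

```shell
# Map the failing export offset to its backing RADOS object name.
# Format-2 RBD objects are named <block_name_prefix>.<16-hex object number>.
offset=5456789504              # offset from the export error above
objsize=$((4 * 1024 * 1024))   # order 22 -> 4 MiB objects
prefix=rbd_data.84a01279e2a9e3 # block_name_prefix from `rbd info`
obj=$(printf '%s.%016x' "$prefix" $((offset / objsize)))
echo "$obj"
# then e.g.: rados -p one stat "$obj"
```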
Any thoughts would be appreciated.
Some info:
[root@ceph-p-mon1 home]# rbd info one/one-177-588-0
rbd image 'one-177-588-0':
size 58.6GiB in 15000 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.84a01279e2a9e3
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Fri Apr 20 17:06:09 2018
parent: one/one-177@snap
overlap: 2.20GiB
[root@ceph-p-mon1 home]# ceph status
cluster:
id: 6a2e8f21-bca2-492b-8869-eecc995216cc
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-p-mon2,ceph-p-mon1,ceph-p-mon3
mgr: ceph-p-mon2(active)
mds: cephfsec-1/1/1 up {0=ceph-p-mon2=up:active}, 6 up:standby
osd: 155 osds: 154 up, 154 in
data:
pools: 6 pools, 5904 pgs
objects: 145.53M objects, 192TiB
usage: 253TiB used, 290TiB / 543TiB avail
pgs: 5896 active+clean
8 active+clean+scrubbing+deep
io:
client: 921KiB/s rd, 5.68MiB/s wr, 110op/s rd, 29op/s wr
cache: 5.65MiB/s flush, 0op/s promote
[root@ceph-p-mon1 home]# rpm -qa | grep ceph
ceph-common-12.2.9-0.el7.x86_64
ceph-mds-12.2.9-0.el7.x86_64
ceph-radosgw-12.2.9-0.el7.x86_64
ceph-mgr-12.2.9-0.el7.x86_64
ceph-12.2.9-0.el7.x86_64
collectd-ceph-5.8.1-1.el7.x86_64
ceph-deploy-2.0.1-0.noarch
libcephfs2-12.2.9-0.el7.x86_64
python-cephfs-12.2.9-0.el7.x86_64
ceph-selinux-12.2.9-0.el7.x86_64
ceph-osd-12.2.9-0.el7.x86_64
ceph-base-12.2.9-0.el7.x86_64
ceph-mon-12.2.9-0.el7.x86_64
ceph-release-1-1.el7.noarch
Rhian Resnick
Associate Director Research Computing
Enterprise Systems
Office of Information Technology
Florida Atlantic University
777 Glades Road, CM22, Rm 173B
Boca Raton, FL 33431
Phone 561.297.2647
Fax 561.297.0222
[image] <https://hpc.fau.edu/wp-content/uploads/2015/01/image.jpg>
Hi,
I'm not able to bootstrap an OSD container for a physical device or LVM.
Has anyone been able to bootstrap it?
Sorry if this is not the correct place to post this question. If not, I
apologize and would be grateful if anyone could redirect me to the correct
place.
Thanks in advance
Oscar