ceph-ansible in Pacific and beyond?

List overview All Threads
Download

newer

older

Bucket index OMAP keys unevenly...

ceph orch status hangs forever

Matthew Vernon

17 Mar 2021 17 Mar '21

10:20 p.m.

Hi, I caught up with Sage's talk on what to expect in Pacific ( https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

Show replies by date

Teoman Onay

17 Mar 17 Mar

10:36 p.m.

Hi! AFAIK the focus is on ceph-adm to replace ceph-ansible. Today it is still missing some important features but it is just a matter of time. I don't think that the devs will do twice the work, once for cephadm and once for ceph-ansible but if someone feels the need to keep it working and implement new features , why not. my 2 cents Teoman On Wed, Mar 17, 2021 at 5:50 PM Matthew Vernon <mv3(a)sanger.ac.uk> wrote:

...

Matthew H

10:56 p.m.

There should not be any performance difference between an un-containerized version and a containerized one. The shift to containers makes sense, as this is the general direction that the industry as a whole is taking. I would suggest giving cephadm a try, it's relatively straight forward and significantly faster for deployments then ceph-ansible is. ________________________________ From: Matthew Vernon <mv3(a)sanger.ac.uk> Sent: Wednesday, March 17, 2021 12:50 PM To: ceph-users <ceph-users(a)ceph.io> Subject: [ceph-users] ceph-ansible in Pacific and beyond? Hi, I caught up with Sage's talk on what to expect in Pacific ( https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Teoman Onay

11:08 p.m.

A containerized environment just makes troubleshooting more difficult, getting access and retrieving details on Ceph processes isn't as straightforward as with a non containerized infrastructure. I am still not convinced that containerizing everything brings any benefits except the collocation of services. On Wed, Mar 17, 2021 at 6:27 PM Matthew H <matthew.heler(a)hotmail.com> wrote:

...

Fox, Kevin M

11:28 p.m.

There are a lot of benefits to containerization that is hard to do without it. Finer grained ability to allocate resources to services. (This process gets 2g of ram and 1 cpu) Security is better where only minimal software is available within the container so on service compromise its harder to escape. Ability to run exactly what was tested / released by upstream. Fewer issues with version mismatches. Especially useful across different distros. Easier to implement orchestration on top which enables some of the advanced features such as easy to allocate iscsi/nfs volumes. Ceph is finally doing so now that it is focusing on containers. And much more. ________________________________________ From: Teoman Onay <tonay(a)redhat.com> Sent: Wednesday, March 17, 2021 10:38 AM To: Matthew H Cc: Matthew Vernon; ceph-users Subject: [ceph-users] Re: ceph-ansible in Pacific and beyond? Check twice before you click! This email originated from outside PNNL. A containerized environment just makes troubleshooting more difficult, getting access and retrieving details on Ceph processes isn't as straightforward as with a non containerized infrastructure. I am still not convinced that containerizing everything brings any benefits except the collocation of services. On Wed, Mar 17, 2021 at 6:27 PM Matthew H <matthew.heler(a)hotmail.com> wrote:

...

There should not be any performance difference between an un-containerized version and a containerized one. The shift to containers makes sense, as this is the general direction that the industry as a whole is taking. I would suggest giving cephadm a try, it's relatively straight forward and significantly faster for deployments then ceph-ansible is. ________________________________ From: Matthew Vernon <mv3(a)sanger.ac.uk> Sent: Wednesday, March 17, 2021 12:50 PM To: ceph-users <ceph-users(a)ceph.io> Subject: [ceph-users] ceph-ansible in Pacific and beyond? Hi, I caught up with Sage's talk on what to expect in Pacific ( https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.youtu… ) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

_______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Martin Verges

18 Mar 18 Mar

12:19 a.m.

Hello,

...

Finer grained ability to allocate resources to services. (This process

gets 2g of ram and 1 cpu) do you really believe this is a benefit? How can it be a benefit to have crashing or slow OSDs? Sounds cool but doesn't work in most environments I ever had my hands on. We often encounter cluster that fall apart or have a meltdown just because they run out of memory and we use tricks like zram to help them out and recover their clusters. If I now go and do it per container/osd in a finer grained way, it will just blow up even more. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.verges(a)croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io YouTube: https://goo.gl/PGE1Bx Am Mi., 17. März 2021 um 18:59 Uhr schrieb Fox, Kevin M <Kevin.Fox(a)pnnl.gov

...

> There are a lot of benefits to containerization that is hard to do without > it.

...

Finer grained ability to allocate resources to services. (This process

> gets 2g of ram and 1 cpu) > Security is better where only minimal software is available within the > container so on service compromise its harder to escape. > Ability to run exactly what was tested / released by upstream. Fewer > issues with version mismatches. Especially useful across different distros. > Easier to implement orchestration on top which enables some of the > advanced features such as easy to allocate iscsi/nfs volumes. Ceph is > finally doing so now that it is focusing on containers. > And much more. > > ________________________________________ > From: Teoman Onay <tonay(a)redhat.com> > Sent: Wednesday, March 17, 2021 10:38 AM > To: Matthew H > Cc: Matthew Vernon; ceph-users > Subject: [ceph-users] Re: ceph-ansible in Pacific and beyond? > > Check twice before you click! This email originated from outside PNNL. > > > A containerized environment just makes troubleshooting more difficult, > getting access and retrieving details on Ceph processes isn't as > straightforward as with a non containerized infrastructure. I am still not > convinced that containerizing everything brings any benefits except the > collocation of services. > > On Wed, Mar 17, 2021 at 6:27 PM Matthew H <matthew.heler(a)hotmail.com> > wrote: > > > There should not be any performance difference between an > un-containerized > > version and a containerized one. > > > > The shift to containers makes sense, as this is the general direction > that > > the industry as a whole is taking. I would suggest giving cephadm a try, > > it's relatively straight forward and significantly faster for deployments > > then ceph-ansible is. > > > > ________________________________ > > From: Matthew Vernon <mv3(a)sanger.ac.uk> > > Sent: Wednesday, March 17, 2021 12:50 PM > > To: ceph-users <ceph-users(a)ceph.io> > > Subject: [ceph-users] ceph-ansible in Pacific and beyond? > > > > Hi, > > > > I caught up with Sage's talk on what to expect in Pacific ( > > > https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.youtu… > ) and there was no mention > > of ceph-ansible at all. > > > > Is it going to continue to be supported? We use it (and uncontainerised > > packages) for all our clusters, so I'd be a bit alarmed if it was going > > to go away... > > > > Regards, > > > > Matthew > > > > > > -- > > The Wellcome Sanger Institute is operated by Genome Research > > Limited, a charity registered in England with number 1021457 and a > > company registered in England with number 2742969, whose registered > > office is 215 Euston Road, London, NW1 2BE. > > _______________________________________________ > > ceph-users mailing list -- ceph-users(a)ceph.io > > To unsubscribe send an email to ceph-users-leave(a)ceph.io > > _______________________________________________ > > ceph-users mailing list -- ceph-users(a)ceph.io > > To unsubscribe send an email to ceph-users-leave(a)ceph.io > > > > > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io > _______________________________________________ > ceph-users mailing list -- ceph-users(a)ceph.io > To unsubscribe send an email to ceph-users-leave(a)ceph.io >

Marc

2:07 a.m.

...

Finer grained ability to allocate resources to services. (This process

Indeed it mostly makes sense in a shared environment or so. Where you are forced to isolate processes or you want your containers to scale or migrate automatically. But it is not like osd.23 is ever going to move from hosta to hostb. So you can not use the benefits of limiting cpu and memory resources, you cannot use the benefits of network isolation, you cannot use the benefits of migrating tasks, you cannot use the benefits of scaling tasks. And this applies to >90% of your processes (the osd's) In dedicated ceph cluster you create extra complexity and dependencies on storage of your container images. I guess the container images are 300MB 400MB for an osd or so (assuming they are el7)? I would say that ceph should start supporting something like alpine linux packages or so, so you have a rgw, monitor, mds that is only 20MB. Furthermore something needs to change in mon, mds and rgw design. Afik each rgw still requires a unique client id, which makes it not very friendly to scaling such a task.

Martin Verges

12:21 a.m.

...

I am still not convinced that containerizing everything brings any

benefits except the collocation of services. Is there even a benefit? We as croit collocate all our services from Ceph itself MON,MGR,MDS,OSD,... as well as ISCSI, SMB, NFS,... on the same host. No problem with that, not a single one. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.verges(a)croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io YouTube: https://goo.gl/PGE1Bx Am Mi., 17. März 2021 um 18:39 Uhr schrieb Teoman Onay <tonay(a)redhat.com>om>:

...

There should not be any performance difference between an

un-containerized

version and a containerized one. The shift to containers makes sense, as this is the general direction

that

the industry as a whole is taking. I would suggest giving cephadm a try, it's relatively straight forward and significantly faster for deployments then ceph-ansible is. ________________________________ From: Matthew Vernon <mv3(a)sanger.ac.uk> Sent: Wednesday, March 17, 2021 12:50 PM To: ceph-users <ceph-users(a)ceph.io> Subject: [ceph-users] ceph-ansible in Pacific and beyond? Hi, I caught up with Sage's talk on what to expect in Pacific ( https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

_______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Stefan Kooman

12:39 a.m.

On 3/17/21 7:51 PM, Martin Verges wrote:

...

I am still not convinced that containerizing everything brings any

benefits except the collocation of services. Is there even a benefit?

Decoupling from underlying host OS. On a test cluster I'm running Ubuntu Focal on the host (and a bunch of other stuff, hyperconverged setup) and for testing purposes I needed to run Ceph Mimic there. No Mimic (or Nautilus packges for that matter) available on Ubuntu Focal. In that case it can be convenient to "just run a container". Sure you can build them yourselves. And with the pxe way of deploying you have more flexibility, but most setups (I guess) are not like that. So, for me that was a benefit. I can think of other potential benefits, but I don't want to go there, asof yet, as I still need to convince myself containers are a proper solution to deploy software as far as Ceph is concerned. But it does look promising. Gr. Stefan

Oliver Freyermuth

3:23 a.m.

Am 17.03.21 um 20:09 schrieb Stefan Kooman:

...

On 3/17/21 7:51 PM, Martin Verges wrote:

I am still not convinced that containerizing everything brings any

benefits except the collocation of services. Is there even a benefit?

I think the main issue, trying to summarize the views in this thread, is that the "container path" seems to be becoming the only offering (apart from the manual installation steps). ceph-deploy and now also ceph-ansible seem to have lost support / be slowly losing support. I already triggered a discussion taking quite similar turns a while earlier: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/4N5EEQWQ246… and there was similar feedback from many voices, some even holding off on upgrading Ceph to newer versions due to this move. I personally still agree containers have their advantages in many use cases, but they also have their downsides in other (mostly complementary) setups.[0] There's no free lunch, but often more than one kind of lunch you can choose to your taste, especially in the modular world of Unix. My main message is that there seems to be a general need for a way which users can integrate with their potentially intricate, granular setup, in which they have and want to keep full control of all components. I believe this is something many voices ask for in addition to the "container offering". There are of course the manual installation commands, but running these comes with a big disadvantage: They need to be adapted often between Ceph versions. After all, the project is active, gains new features, and commands gain new flags, and things are deprecated or upgraded (and this is good!). That means anything building on top of the manual commands (be it ceph-deploy, ceph-ansible or anything else) will break early and break often, unless it receives a lot of energy from maintainers. Just looking at the commit histories of these projects easily reveals many "brain hours" have been burnt there, always racing to keep up with new Ceph releases. However, my understanding of the cephadm "orchestrator layer" is that it is modular, and essentially encapsulates many cumbersome "lots of manual steps"-commands into "high level" commands. From this (without checking the code, so please correct me if wrong!) I would expect it to be "easy" to, for example, have an SSH orchestrator running bare-metal, i.e. essentially like "ceph-deploy", since the same commands now being run in containers by the orchestrator could be executed bare-metal after installation of the packages. Of course, some things like an "upgrade" of all components in one orchestration step will not work this way — but I wonder if the users asking for "full control" actually ask for that. This "high level language" could then be used to integrate with ceph-ansible or other special tools people may use, abstracting away the manual commands, removing most of the complexity and breakage whenever the manual commands are changed, significantly reducing the maintenance burden for these tools and other toolings users may be using. In short, it would define an "API" which can be integrated into other tools, following the modular Unix spirit. That is what I actually thought the SSH orchestrator (as it was called in the Nautilus docs once) was going to be, and then it turned out to be limited to the specific container use case (at least for now). Of course, this implementation is not going to come "for free", somebody needs to write it. I hope my summary was inspirational for someone out there knowing the existing orchestrator code and maybe interested in adding this (if it is of general interest, which I believe it is). And I hope this won't reduce the thriving user base of Ceph in the long run, since I am still in love with this project in general (and the community around it!) and will keep recommending it :-). Cheers, Oliver [0] For example, when you have a full stack automated with Puppet or any other configuration management, control of the OS updates, testing machines you upgrade / reconfigure first before applying changes to production, monitoring of all services etc., then adding containers for storage services is just another layer of complexity. All you get in such a case is just another "thing" you need to update (it will by design also contain libraries not part of Ceph), you need to monitor for CVEs, you may need to re-pull an updated container and restart at different times from all the other parts of your infrastructure, and so on. Things like RDMA become harder, for which drivers and libraries from the "machine OS" may need to be matched up or mixed in, shared libraries in the container use extra memory since they are of different versions than those from the OS etc. Containers are good for isolation, reproducibilty and mobility of compute, but you don't gain so much when you don't "re-orchestrate" often. So as usual, mileage may vary.

Lars Täuber

12:06 p.m.

I vote for an SSH orchestrator for a bare metal installation too! (And no, I'm not able to write it.) Our second Ceph cluster is underway, and I don't know if we ever update our first cluster (nautilus) to a containerized version. It is constructed a special way. Thanks! Lars

Milan Kupcevic

9:40 p.m.

On 3/18/21 2:36 AM, Lars Täuber wrote:

...

I vote for an SSH orchestrator for a bare metal installation too!

+1 Cephadm with a no containers option would do. Milan -- Milan Kupcevic Senior Cyberinfrastructure Engineer at Project NESE Harvard University FAS Research Computing

Reed Dier

19 Mar 19 Mar

1:27 a.m.

I too would be amenable to a cephadm orchestrator on bare metal if that were an option. I thought that I would drop this video for anyone that hasn't seen it yet. It hits on a lot of people's sentiments about containers being not great for debugging, while also going into some of the pros to the containerization at the same time. Great presentation. https://www.youtube.com/watch?v=pPZsN_urpqw <https://www.youtube.com/watch?v=pPZsN_urpqw> Reed

...

On Mar 18, 2021, at 11:10 AM, Milan Kupcevic <milan_kupcevic(a)harvard.edu> wrote: On 3/18/21 2:36 AM, Lars Täuber wrote:

I vote for an SSH orchestrator for a bare metal installation too!

+1 Cephadm with a no containers option would do. Milan -- Milan Kupcevic Senior Cyberinfrastructure Engineer at Project NESE Harvard University FAS Research Computing _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Matthew H

18 Mar 18 Mar

12:47 a.m.

"A containerized environment just makes troubleshooting more difficult, getting access and retrieving details on Ceph processes isn't as straightforward as with a non containerized infrastructure. I am still not convinced that containerizing everything brings any benefits except the collocation of services." It changes the way you troubleshoot, but I don't find it more difficult in the issues I have seen and had. Even today without containers, all services can be co-located within the same hosts (mons,mgrs,osds,mds).. Is there a situation you've seen where that has not been the case? ________________________________ From: Teoman Onay <tonay(a)redhat.com> Sent: Wednesday, March 17, 2021 1:38 PM To: Matthew H <matthew.heler(a)hotmail.com> Cc: Matthew Vernon <mv3(a)sanger.ac.uk>uk>; ceph-users <ceph-users(a)ceph.io> Subject: Re: [ceph-users] Re: ceph-ansible in Pacific and beyond? A containerized environment just makes troubleshooting more difficult, getting access and retrieving details on Ceph processes isn't as straightforward as with a non containerized infrastructure. I am still not convinced that containerizing everything brings any benefits except the collocation of services. On Wed, Mar 17, 2021 at 6:27 PM Matthew H <matthew.heler@hotmail.com<mailto:matthew.heler@hotmail.com>> wrote: There should not be any performance difference between an un-containerized version and a containerized one. The shift to containers makes sense, as this is the general direction that the industry as a whole is taking. I would suggest giving cephadm a try, it's relatively straight forward and significantly faster for deployments then ceph-ansible is. ________________________________ From: Matthew Vernon <mv3@sanger.ac.uk<mailto:mv3@sanger.ac.uk>> Sent: Wednesday, March 17, 2021 12:50 PM To: ceph-users <ceph-users@ceph.io<mailto:ceph-users@ceph.io>> Subject: [ceph-users] ceph-ansible in Pacific and beyond? Hi, I caught up with Sage's talk on what to expect in Pacific ( https://www.youtube.com/watch?v=PVtn53MbxTc ) and there was no mention of ceph-ansible at all. Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE. _______________________________________________ ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io> To unsubscribe send an email to ceph-users-leave@ceph.io<mailto:ceph-users-leave@ceph.io> _______________________________________________ ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io> To unsubscribe send an email to ceph-users-leave@ceph.io<mailto:ceph-users-leave@ceph.io>

Janne Johansson

1:39 p.m.

Den ons 17 mars 2021 kl 20:17 skrev Matthew H <matthew.heler(a)hotmail.com>om>:

...

New ceph users pop in all the time on the #ceph IRC and have absolutely no idea on how to see the relevant logs from the containerized services. Me being one of the people that do run services on bare metal (and VMs) I actually can't help them, and it seems several other old ceph admins can't either. Not that it is impossible or might not even be hard to get them, but somewhere in the "it is so easy to get it up and running, just pop a container and off you go" docs there seem to be a lack of the parts "when the OSD crashes at boot, run this to export the file normally called /var/log/ceph/ceph-osd.12.log" meaning it becomes a black box to the users and they are left to wipe/reinstall or something else when it doesn't work. At the end, I guess the project will see less useful reports with Assert Failed logs from impossible conditions and more people turning away from something that could be fixed in the long run. I get some of the advantages, and for stateless services elsewhere it might be gold to have containers, I am not equally enthusiastic about it for ceph. -- May the most significant bit of your life be positive.

Wido den Hollander

2:11 p.m.

On 18/03/2021 09:09, Janne Johansson wrote:

...

Den ons 17 mars 2021 kl 20:17 skrev Matthew H <matthew.heler(a)hotmail.com>om>:

Me being one of them. Yes, it's all possible with containers, but it's different. And I don't see the true benefit of running Ceph in Docker just yet. Another layer of abstraction which you need to understand. Also, when you need to do real emergency stuff like working with ceph-objectstore-tool to fix broken OSDs/PGs it's just much easier to work on a bare-metal box than with containers (if you ask me). So no, I am not convinced yet. Not against it, but personally I would say it's not the only way forward. DEB and RPM packages are still alive and kicking. Wido > Not that it is impossible or might not even be hard to get them, but > somewhere in the "it is so easy to get it up and running, just pop a > container and off you go" docs there seem to be a lack of the parts > "when the OSD crashes at boot, run this to export the file normally > called /var/log/ceph/ceph-osd.12.log" meaning it becomes a black box > to the users and they are left to wipe/reinstall or something else > when it doesn't work. At the end, I guess the project will see less > useful reports with Assert Failed logs from impossible conditions and > more people turning away from something that could be fixed in the > long run. > > I get some of the advantages, and for stateless services elsewhere it > might be gold to have containers, I am not equally enthusiastic about > it for ceph. >

Martin Verges

2:19 p.m.

...

So no, I am not convinced yet. Not against it, but personally I would say

it's not the only way forward. 100% agree to your whole answer -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.verges(a)croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io YouTube: https://goo.gl/PGE1Bx On Thu, 18 Mar 2021 at 09:42, Wido den Hollander <wido(a)42on.com> wrote:

...

On 18/03/2021 09:09, Janne Johansson wrote:

Den ons 17 mars 2021 kl 20:17 skrev Matthew H <matthew.heler(a)hotmail.com : > > "A containerized environment just makes troubleshooting more difficult,

getting access and retrieving details on Ceph processes isn't as straightforward as with a non containerized infrastructure. I am still not convinced that containerizing everything brings any benefits except the collocation of services."

> > It changes the way you troubleshoot, but I don't find it more difficult

in the issues I have seen and had. Even today without containers, all services can be co-located within the same hosts (mons,mgrs,osds,mds).. Is there a situation you've seen where that has not been the case?

Not that it is impossible or might not even be hard to get them, but somewhere in the "it is so easy to get it up and running, just pop a container and off you go" docs there seem to be a lack of the parts "when the OSD crashes at boot, run this to export the file normally called /var/log/ceph/ceph-osd.12.log" meaning it becomes a black box to the users and they are left to wipe/reinstall or something else when it doesn't work. At the end, I guess the project will see less useful reports with Assert Failed logs from impossible conditions and more people turning away from something that could be fixed in the long run. I get some of the advantages, and for stateless services elsewhere it might be gold to have containers, I am not equally enthusiastic about it for ceph.

_______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

Jaroslaw Owsiewski

2:29 p.m.

czw., 18 mar 2021 o 09:42 Wido den Hollander <wido(a)42on.com> napisał(a):

...

Stefan Kooman

3:55 p.m.

On 3/18/21 9:09 AM, Janne Johansson wrote:

...

Den ons 17 mars 2021 kl 20:17 skrev Matthew H <matthew.heler(a)hotmail.com>om>:

New ceph users pop in all the time on the #ceph IRC and have absolutely no idea on how to see the relevant logs from the containerized services.

While you might not need that much Ceph knowledge to get Ceph up and running, it does require users to know how container deployments work. I had to put quite a bit of work in to get what ceph-ansible was doing to deploy the containers, and why it would fail (after some other tries). You do need to have Ceph knowledge, still, when things do not go as expected, and even beforehand to make the right decisions on how to set up all the infrastructure. So arguably you need even more knowledge to understand what is going on under the hood, be it Ceph or containers.

...

Me being one of the people that do run services on bare metal (and VMs) I actually can't help them, and it seems several other old ceph admins can't either. Not that it is impossible or might not even be hard to get them, but somewhere in the "it is so easy to get it up and running, just pop a container and off you go" docs there seem to be a lack of the parts "when the OSD crashes at boot, run this to export the file normally called /var/log/ceph/ceph-osd.12.log" meaning it becomes a black box to the users and they are left to wipe/reinstall or something else when it doesn't work. At the end, I guess the project will see less useful reports with Assert Failed logs from impossible conditions and more people turning away from something that could be fixed in the long run.

There is a ceph manager module for that: https://docs.ceph.com/en/latest/mgr/crash/ I guess an option to "always send crash logs to Ceph" could be build in. If you trust Ceph with this data of course (opt-in).

...

I get some of the advantages, and for stateless services elsewhere it might be gold to have containers, I am not equally enthusiastic about it for ceph.

Yeah, so I think it's good to discuss pros and cons and see what problem it solves, and what extra problems it creates. Gr. Stefan

Guillaume Abrioux

8:33 p.m.

Hi all, ceph-ansible(a)stable-6.0 supports pacific and the current content in the branch 'master' (future stable-7.0) is intended to support Ceph Quincy. I can't speak on behalf of Dimitri but I'm personally willing to keep maintaining ceph-ansible if there are interests, but people must be aware that: - the official and supported installer will be cephadm, - officially, ceph-ansible will be unsupported with the consequences it might bring (testing efforts, ci ressources, etc.), - Fewer engineering efforts will be carried out (contributions are welcome!) Thanks, On Thu, 18 Mar 2021 at 11:26, Stefan Kooman <stefan(a)bit.nl> wrote:

...

On 3/18/21 9:09 AM, Janne Johansson wrote:

Den ons 17 mars 2021 kl 20:17 skrev Matthew H <matthew.heler(a)hotmail.com : > > "A containerized environment just makes troubleshooting more difficult,

> > It changes the way you troubleshoot, but I don't find it more difficult

New ceph users pop in all the time on the #ceph IRC and have absolutely no idea on how to see the relevant logs from the containerized services.

I get some of the advantages, and for stateless services elsewhere it might be gold to have containers, I am not equally enthusiastic about it for ceph.

Yeah, so I think it's good to discuss pros and cons and see what problem it solves, and what extra problems it creates. Gr. Stefan _______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

-- *Guillaume Abrioux*Senior Software Engineer

Matthew Vernon

9:53 p.m.

New subject: ceph-ansible in Pacific and beyond? [EXT]

Hi, On 18/03/2021 15:03, Guillaume Abrioux wrote:

...

ceph-ansible(a)stable-6.0 supports pacific and the current content in the branch 'master' (future stable-7.0) is intended to support Ceph Quincy. I can't speak on behalf of Dimitri but I'm personally willing to keep maintaining ceph-ansible if there are interests, but people must be aware that:

This is good to know, thank you :) I hadn't realised my question would spawn such a monster thread! Regards, Matthew -- The Wellcome Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.

Milan Kupcevic

4:39 a.m.

On 3/17/21 1:38 PM, Teoman Onay wrote:

...

+1 To a man with a Docker, everything looks like a container. Milan -- Milan Kupcevic Senior Cyberinfrastructure Engineer at Project NESE Harvard University FAS Research Computing

Alexander E. Patrakov

8:24 a.m.

I agree with this sentiment. Please do not make a containerized and orchestrated deployment mandatory until all of the documentation is rewritten to take this deployment scenario into account. Also, in the past year, I have personally tested three Ceph training courses from various vendors. They all share the same weakness: explain how to deal with failed OSD disks in a non-containerized scenario, how to redeploy the OSD after replacing the disk, then at the end - how to take the cluster over using cephadm, and as a result, suddenly the "how to replace a disk and redeploy the OSD" knowledge is inapplicable. ср, 17 мар. 2021 г. в 22:39, Teoman Onay <tonay(a)redhat.com>om>:

...

_______________________________________________ ceph-users mailing list -- ceph-users(a)ceph.io To unsubscribe send an email to ceph-users-leave(a)ceph.io

-- Alexander E. Patrakov CV: http://u.pc.cd/wT8otalK

Milan Kupcevic

4:38 a.m.

On 3/17/21 1:26 PM, Matthew H wrote:

...

There should not be any performance difference between an un-containerized version and a containerized one.

That is right. Let us choose which one fits our setup better. Milan -- Milan Kupcevic Senior Cyberinfrastructure Engineer at Project NESE Harvard University FAS Research Computing

Stefan Kooman

19 Mar 19 Mar

10:41 a.m.

On 3/17/21 5:50 PM, Matthew Vernon wrote:

...

Just a reminder to all of you. Please fill in the Ceph-user survey and make your voice heard [1]. Gr. Stefan [1]: https://ceph.io/user-survey/

Gregory Orange

20 May 20 May

2:02 p.m.

Hi, On 19/3/21 1:11 pm, Stefan Kooman wrote:

...

Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away...

Just a reminder to all of you. Please fill in the Ceph-user survey and > make your voice heard [1].

Bother, I missed it. Thanks to some colleagues for pointing me to this thread anyway. Also thank you everyone for the thoughtful contributions here. The added complexity of containers has not been worthwhile for us thus far, and ceph-ansible is the right tool for our purposes. Of course we will adjust if we are required to, but the preference will clearly be to stick with ceph-ansible and non-containerised until other ways suit our environment better. This is another vote for preserving both of them at least for now, and certainly not banishing them both at the same time. Regards, Greg.

1072

days inactive

1136

days old

ceph-users@ceph.io

Manage subscription

25 comments

17 participants

tags (0)

participants (17)

Alexander E. Patrakov
Fox, Kevin M
Gregory Orange
Guillaume Abrioux
Janne Johansson
Jaroslaw Owsiewski
Lars Täuber
Marc
Martin Verges
Matthew H
Matthew Vernon
Milan Kupcevic
Oliver Freyermuth
Reed Dier
Stefan Kooman
Teoman Onay
Wido den Hollander