Hello Sage,
...I think that part of this comes down to a learning
curve...
...cephadm represent two of the most successful efforts to address
usability...
Somehow it does not look right to me.
There is much more to operating a Ceph cluster than just deploying software.
Of course that helps in the short run to keep people from leaving the train
right when they start their Ceph journey. But the harder part is what to do
when shit hits the fan and your cluster is down due to some issue, and then
additional layers of complexity kick in and bite you.
Just saying that day-2 ops is much more important than getting a cluster
up and running. In my belief, no admin wants to dig around in containers and
other abstractions when the single most important part of a whole IT
infrastructure stops working. But that's just my thought, maybe I'm wrong.
In my opinion, the best possible way to run IT software is KISS, keep it
simple and stupid: no additional layers, no abstractions of abstractions,
and good error messages.
For example, the Docker topic here looks like something that can be
showcased:
Question: If it uses Docker and the Docker daemon fails, what happens to
your containers?
Answer: This is an obnoxious feature of Docker.
As you might see, you need a lot of knowledge about the abstraction layers
to operate them well. Docker, for example, provides so-called live-restore
(https://docs.docker.com/config/containers/live-restore/), which allows you
to stop the daemon without killing your containers. This enables you to
update the Docker daemon without downtime, but you have to know about it
and, of course, enable it. This can make operating a Ceph cluster harder,
not easier.
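To illustrate the point, this is roughly what enabling the setting looks
like per the linked Docker docs (a minimal sketch: if your daemon.json
already contains other options, merge the key in rather than overwriting
the file):

```shell
# Sketch: enable Docker's live-restore so running containers survive
# a dockerd stop or upgrade (see the linked live-restore docs).
cat > /etc/docker/daemon.json <<'EOF'
{
  "live-restore": true
}
EOF

# Reload the daemon configuration without restarting containers.
systemctl reload docker
```

It is exactly this kind of non-default, easy-to-miss knob that an admin has
to know about before the daemon fails, not after.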
What about more sophisticated topics, for example performance? Ceph
already is not a fast storage solution, with way too high latency. Does it
help to add containers instead of going more directly to the hardware and
reducing overhead? Of course you can run SPDK and/or DPDK inside
containers, but does that make it better, faster, or even easier? If you
need high-performance storage today, you can turn to open source
alternatives that are massively cheaper per IO and only minimally more
expensive per GB. I therefore believe that stripping out overhead is also
an important topic for the future of Ceph.
--
Martin Verges
Managing director
Mobile: +49 174 9335695
E-Mail: martin.verges(a)croit.io
Chat: https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx
On Fri, 18 Jun 2021 at 20:43, Sage Weil <sage(a)newdream.net> wrote:
Following up with some general comments on the main container downsides
and on the upsides that led us down this path in the first place.
Aside from a few minor misunderstandings, it seems like most of the
objections to containers boil down to a few major points:
Containers are more complicated than packages,
making debugging harder.
I think that part of this comes down to a learning curve and some
semi-arbitrary changes to get used to (e.g., systemd unit name has
changed; logs now in /var/log/ceph/$fsid instead of /var/log/ceph).
Other changes are real hoops to jump through: to inspect the
process(es) inside a container you have to `cephadm enter --name ...`;
the ceph CLI may not be automatically installed on every host; stracing
or finding coredumps requires extra steps. We're continuing to improve
the tools etc., so please call these things out as you see them!
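For anyone who hasn't tried it, the extra steps look roughly like this (a
sketch only; the daemon name `osd.3` is illustrative, list the real ones
with `cephadm ls`):

```shell
# Show the containerized daemons deployed on this host.
cephadm ls

# Open a shell inside the container running a specific daemon.
cephadm enter --name osd.3

# Fetch the journald logs for that daemon.
cephadm logs --name osd.3

# Run the ceph CLI from a container when it isn't installed on the host.
cephadm shell -- ceph -s
```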
Security (50 containers -> 50 versions of openssl to patch)
This feels like the most tangible critique. It's a tradeoff. We have
had so many bugs over the years due to varying versions of our
dependencies that containers feel like a huge win: we can finally test
and distribute something that we know won't break due to some random
library on some random distro. But it means the Ceph team is on the
hook for rebuilding our containers when the libraries inside the
container need to be patched.
On the flip side, cephadm's use of containers offers some huge wins:
- Package installation hell is gone. Previously, ceph-deploy and
ceph-ansible had thousands of lines of code to deal with the myriad
ways that packages could be installed and where they could be
published. With containers, this now boils down to a single string,
which is usually just something like "ceph/ceph:v16". We've grown a
handful of complexity there to let you log into private registries,
but otherwise things are so much simpler. Not to mention what happens
when package dependencies break.
- Upgrades/downgrades can be carefully orchestrated. With packages,
the version change is by host, with a limbo period (and occasional
SIGBUS) before daemons were restarted. Now we can run new or patched
code on individual daemons and avoid an accidental upgrade when a
daemon restarts. (Also, ceph CLI commands no longer error out with a
dynamic linker error while the package upgrade itself is in progress,
something all of our automated upgrade tests have to carefully avoid
to prevent intermittent failures.)
- Ceph installations are carefully sandboxed. Removing/scrubbing ceph
from a host is trivial as only a handful of directories or
configuration files are touched. And we can safely run multiple
clusters on the same machine without worry about bad interactions
(mostly great for development, but also handy for users experimenting
with new features etc).
- Cephadm deploys a bunch of non-ceph software as well to provide a
complete storage system, including haproxy and keepalived for HA
ingress for RGW and NFS, ganesha for NFS service, grafana, prometheus,
node-exporter, and (soon) samba for SMB. All neatly containerized to
avoid bumping into other software on the host; testing and supporting
the huge matrix of package versions available via various distros
would be a huge time sink.
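The orchestrated-upgrade point above can be sketched concretely (the image
tag here is only an example; pick the release you are targeting):

```shell
# Start a rolling upgrade to a specific container image; cephadm
# restarts daemons one at a time rather than per-host.
ceph orch upgrade start --image ceph/ceph:v16.2.4

# Watch progress, and stop the rollout if something looks wrong.
ceph orch upgrade status
ceph orch upgrade pause

# Continue once the issue is understood.
ceph orch upgrade resume
```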
Most importantly, cephadm and the orchestrator API vastly improve the
overall ceph experience from the CLI and dashboard. Users no longer
have to give any thought to where and which daemons run if they don't
want to (or they can carefully specify daemon placement if they
choose). And users can use commands like 'ceph fs volume create foo'
and the fs will get created *and* MDS daemons will be started all in
one go. (This would also be possible with a package-based
orchestrator implementation if one existed.)
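The 'ceph fs volume create' flow above, spelled out as a sketch ('foo' is
just an example volume name):

```shell
# Create the pools and filesystem, and schedule MDS daemons, in one go.
ceph fs volume create foo

# The orchestrator now shows the resulting mds service...
ceph orch ls --service-type mds

# ...and placement can still be adjusted explicitly if desired.
ceph orch apply mds foo --placement=3
```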
We've been beaten up for years about how complicated and hard Ceph is.
Rook and cephadm represent two of the most successful efforts to
address usability (and not just because they enable deployment
management via the dashboard!), and taking advantage of containers was
one expedient way to get to where we needed to go. If users feel
strongly about supporting packages, we can get much of the same
experience with another package-based orchestrator module. My view,
though, is that we have much higher priority problems to tackle.
sage
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io