Thanks!
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Fri, Dec 6, 2019 at 1:44 PM Alfredo Deza <adeza(a)redhat.com> wrote:
On Fri, Dec 6, 2019 at 5:59 AM Sebastien Han <shan(a)redhat.com> wrote:
Hi,
Following up on my previous ceph-volume email as promised.
When running Ceph with Rook in Kubernetes in the cloud (AWS, Azure,
Google, whatever), the OSDs are backed by PVCs (cloud block storage)
attached to virtual machines.
This makes the storage portable: if the VM dies, the device will be
attached to a new virtual machine and the OSD will resume running.
In Rook, we have 2 main deployments for the OSD:
1. Prepare the disk to become an OSD
Prepare runs on the VM: it attaches the block device and runs
"ceph-volume prepare", and then things get complicated. After this,
the device is supposed to be detached from the VM because the
container has terminated. However, the block device is still held by
LVM, so the volume group (VG) must be deactivated. Currently, we do
this in Rook, but it would be nice if ceph-volume deactivated the VG
itself once it is done preparing the disk in a container.
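For illustration, the deactivation we do in Rook today boils down to
something like this (the VG name here is made up):

    # Release the logical volumes so the cloud block device
    # can be detached from the VM
    vgchange -an ceph-9a531951-50f2-4d48-b012-0aef0febc301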
2. Activate the OSD.
Now, in the new container, the device is attached to the VM again.
At this point, more changes will be required in ceph-volume,
particularly in the "activate" call.
a. ceph-volume should activate the VG

By VG you mean LVM's Volume Group?

b. ceph-volume should activate the device normally

Not "normally" though, right? That would imply starting the OSD,
which you are indicating is not desired.

c. ceph-volume should run the ceph-osd process in the foreground, as
well as accept flags on that CLI. We could have something like:

    ceph-volume lvm activate --no-systemd $STORE_FLAG $OSD_ID $OSD_UUID <a bunch of flags>

Perhaps we need a new flag to indicate we want to run the OSD
process in the foreground?
Here is an example of how an OSD runs today:

    ceph-osd --foreground --id 2 \
        --fsid 9a531951-50f2-4d48-b012-0aef0febc301 \
        --setuser ceph --setgroup ceph \
        --crush-location="root=default host=minikube" \
        --default-log-to-file false \
        --ms-learn-addr-from-peer=false
--> we can have a bunch of flags or an env var with all the flags,
whichever you prefer.
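To make that concrete, the two options could look something like this
(both forms are purely illustrative; neither the --foreground flag nor
the CEPH_OSD_FLAGS variable exists today):

    # Option 1: pass the OSD flags on the command line
    ceph-volume lvm activate --no-systemd --foreground 2 \
        9a531951-50f2-4d48-b012-0aef0febc301 -- --default-log-to-file false

    # Option 2: pass the OSD flags through an environment variable
    CEPH_OSD_FLAGS="--default-log-to-file false" \
        ceph-volume lvm activate --no-systemd --foreground 2 \
        9a531951-50f2-4d48-b012-0aef0febc301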
This wrapper should watch for signals too; it should respond to
SIGTERM in the following way:
- stop the OSD
- deactivate the VG
- exit 0
Just a side note: the VG must be deactivated when the container stops
so that the block device can be detached from the VM; otherwise,
it will still be held by LVM.
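A minimal sketch of such a wrapper (all names and the VG naming scheme
are illustrative, and it assumes the OSD flags are passed separately):

    #!/bin/sh
    # Illustrative OSD wrapper: activate the VG, run ceph-osd in the
    # foreground, and clean up on SIGTERM so the cloud block device
    # can be detached from the VM.
    VG="ceph-${OSD_UUID}"   # made-up VG naming scheme

    cleanup() {
        kill -TERM "$OSD_PID" 2>/dev/null   # stop the OSD
        wait "$OSD_PID" 2>/dev/null
        vgchange -an "$VG"                  # deactivate the VG
        exit 0
    }
    trap cleanup TERM INT

    vgchange -ay "$VG"                      # activate the VG
    ceph-volume lvm activate --no-systemd "$OSD_ID" "$OSD_UUID"
    ceph-osd --foreground --id "$OSD_ID" --fsid "$OSD_UUID" \
        --setuser ceph --setgroup ceph &
    OSD_PID=$!
    wait "$OSD_PID"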
I am worried that this goes beyond what I consider the scope of
ceph-volume, which is: prepare device(s) to be part of an OSD.
Catching signals, handling the OSD in the foreground, and accepting
(proxying) flags sound problematic for a robust implementation in
ceph-volume, even if that means it would help Rook in this case.
The other challenge I see is that Ceph seems to be in a transition
from being a bare-metal project to a containerized one, yet lots of
tooling (like ceph-volume) is deeply tied to the non-containerized
workflows. This makes it difficult (and non-obvious!) to add more
flags to ceph-volume for things that help the containerized
deployment.
To solve the issues you describe, I think you need either a separate
command-line tool that can invoke ceph-volume with the added features
you listed, or, if there is a significant push to get more things into
ceph-volume, a separate sub-command, so that `lvm` is isolated from
the conflicting logic.
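For example (the sub-command name here is purely hypothetical; nothing
like it exists today):

    ceph-volume container activate --osd-id 2 \
        --osd-fsid 9a531951-50f2-4d48-b012-0aef0febc301

so that the container-specific behavior (VG activation/deactivation,
foreground ceph-osd, signal handling) stays out of `lvm`.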
My preference would be a wrapper script, separate from the Ceph project.
Hopefully, I was clear :).
This is just a proposal; if you feel like this could be done
differently, feel free to suggest.
Thanks!
–––––––––
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
_______________________________________________
Dev mailing list -- dev(a)ceph.io
To unsubscribe send an email to dev-leave(a)ceph.io