On Wed, Aug 21, 2019 at 11:53 AM Jason Dillaman <jdillama(a)redhat.com> wrote:
> On Aug 21, 2019, at 11:41 AM, Florian Haas <florian(a)citynetwork.eu> wrote:
>
> Hi Jason! Thanks for the quick reply.
>
> On 21/08/2019 16:51, Jason Dillaman wrote:
>> It just looks like this was an oversight from the OpenStack developers
>> when Nova RBD "direct" ephemeral image snapshot support was added [1].
>> I would open a bug ticket against Nova for the issue.
>>
>> [1] https://opendev.org/openstack/nova/commit/824c3706a3ea691781f4fcc4453881517…
>
> OK, wow... that was 4 years ago, does that mean that quiesce/freeze/thaw
> for RBD-backed Nova instances has probably been non-functional
> throughout that time?

Just to clarify, in the initial implementation, only cold snapshots
were supported for RBD [1], so there was no need to quiesce the disk.
The issue was introduced by [2] when that restriction for RBD images
was removed about a year later.

> Looking at the reno for that commit I had an idea for a workaround:
>
> features:
>   - When RBD is used for ephemeral disks and image storage, make
>     snapshot use Ceph directly, and update Glance with the new location.
>     In case of failure, it will gracefully fallback to the "generic"
>     snapshot method. This requires changing the typical permissions
>     for the Nova Ceph user (if using authx) to allow writing to
>     the pool where vm images are stored, and it also requires
>     configuring Glance to provide a v2 endpoint with direct_url
>     support enabled (there are security implications to doing this).
>     See http://docs.ceph.com/docs/master/rbd/rbd-openstack/ for more
>     information on configuring OpenStack with RBD.
>
> So, suppose that deployers running Nova with ephemeral disks on RBD
> prefer snapshot consistency over this shortcut. Until Nova fixes the
> direct_snapshot() call, I figured that such deployers could tweak the
> caps for the Nova CephX identity such that that user were no longer
> allowed to write to the Glance pool.

Yes, that would be my recommendation.
>
> Under those circumstances, the snapshot creation (in the ephemeral pool)
> would work, but then the clone() call in this line should throw
> nova.exception.Forbidden from an rbd.PermissionError:
>
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac…
>
> Which should then trigger this except block:
>
> https://opendev.org/openstack/nova/src/commit/7bf75976016aae5d458eca9f6ddac…
>
> ... and Nova/libvirt should go back to the (arguably more correct) fallback.
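
A minimal sketch of the control flow described above, written as a
self-contained toy model rather than Nova's actual code. All names here
(RbdPermissionError, Forbidden, clone, generic_snapshot, snapshot) are
hypothetical stand-ins for illustration; the real signatures in Nova and
in the rbd bindings differ, and the boolean flag only simulates whether
the Nova CephX identity may write to the Glance pool.

```python
class RbdPermissionError(Exception):
    """Hypothetical stand-in for rbd.PermissionError."""


class Forbidden(Exception):
    """Hypothetical stand-in for nova.exception.Forbidden."""


def clone(can_write_glance_pool):
    """Simulate cloning the ephemeral-pool snapshot into the Glance pool.

    The low-level permission error is translated into Forbidden, which is
    the translation the thread expects to happen inside clone().
    """
    try:
        if not can_write_glance_pool:
            raise RbdPermissionError("no write access to the Glance pool")
    except RbdPermissionError as exc:
        raise Forbidden(str(exc)) from exc
    return "direct"


def generic_snapshot():
    """Simulate the generic snapshot path (the fallback)."""
    return "generic"


def snapshot(can_write_glance_pool):
    """Try the direct RBD path first; fall back when it is forbidden."""
    try:
        return clone(can_write_glance_pool)
    except (NotImplementedError, Forbidden):
        # Assumption: the except block treats Forbidden as a signal to
        # use the generic method instead of failing the snapshot.
        return generic_snapshot()


if __name__ == "__main__":
    print(snapshot(can_write_glance_pool=True))
    print(snapshot(can_write_glance_pool=False))
```

In this model, removing write access to the Glance pool does not make the
snapshot fail; it only changes which path produces it. Whether the real
except block catches every error that restricted caps can produce is
something to verify against the linked source.
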
>
> Would you agree with that assessment, or am I missing something? (Just
> trying to make sure that I don't give the Nova folks the wrong facts.)
>
> Thanks again!
>
> Cheers,
> Florian

[1] https://opendev.org/openstack/nova/src/commit/824c3706a3ea691781f4fcc445388…
[2] https://opendev.org/openstack/nova/src/commit/231832354932e26f0d76af1cf1711…

--
Jason