That makes sense. Thanks Ilya.
On Mon, Apr 13, 2020 at 4:10 AM Ilya Dryomov <idryomov(a)gmail.com> wrote:
As Paul said, a lock is typically broken by a new client trying to grab
it. As part of that, the existing lock holder needs to be blacklisted,
unless you fence using some type of STONITH.
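For illustration, here is roughly what that break-and-blacklist sequence
looks like with the python-rados/python-rbd bindings. This is a sketch
only: the pool and image names are made up, and whether the holder is
actually dead is a decision it cannot make for you (see below).

    import json
    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')            # assumed pool name
    with rbd.Image(ioctx, 'myimage') as image:   # assumed image name
        lockers = image.list_lockers()           # empty if unlocked
        if lockers:
            for client, cookie, addr in lockers['lockers']:
                # Fence first: blacklisting the holder's address makes
                # the OSDs reject any in-flight I/O from it.
                cluster.mon_command(json.dumps({
                    'prefix': 'osd blacklist',
                    'blacklistop': 'add',
                    'addr': addr,
                }), b'')
                # Only then break the advisory lock on its behalf.
                image.break_lock(client, cookie)

(The CLI equivalent is "ceph osd blacklist add <addr>" followed by
"rbd lock remove <image> <lock id> <locker>".)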
The question of whether the existing lock holder is dead can't be
answered in isolation. For example, the exclusive-lock feature
(automatic cooperative lock transitions to ensure that only a single
client is writing to the image at any given time) uses watches. If the
existing lock holder has a watch, it is considered alive and the lock
is requested cooperatively. Otherwise, it is considered dead and the
lock is broken. This is implemented with care to avoid various corner
cases related to watches and blacklisting: the client will not grab
the lock without having a watch established, the client will update
the lock cookie if the watch is lost and reestablished, the client
will not use pre-blacklist osdmaps for any post-blacklist I/O, etc.
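To make the idea concrete, here is a rough sketch of such a liveness
check using the watch list. Note this illustrates the concept only; it
is not librbd's actual exclusive-lock implementation, which also handles
the cookie and blacklisting corner cases above:

    import rbd

    def holder_seems_alive(image: rbd.Image) -> bool:
        # The lock holder is presumed alive if its address still has
        # a watch established on the image header.
        lockers = image.list_lockers()
        if not lockers:
            return False                  # nothing holds the lock
        watch_addrs = {w['addr'] for w in image.watchers_list()}
        # list_lockers() yields (client, cookie, addr) tuples; matching
        # on the address alone is a simplification.
        return any(addr in watch_addrs
                   for _, _, addr in lockers['lockers'])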
Since you are grabbing locks manually in the orchestration layer,
it is up to the orchestration layer to decide when (and how) to break
them. rbd can't make that decision for you -- consider a case where
the device is alive and ready to serve I/O, but the workload is stuck
for some other reason.
Thanks,
Ilya
On Sun, Apr 12, 2020 at 8:42 PM Void Star Nill <void.star.nill(a)gmail.com>
wrote:
Paul, Ilya, others,
Any inputs on this?
Thanks,
Shridhar
On Thu, 9 Apr 2020 at 12:30, Void Star Nill <void.star.nill(a)gmail.com>
wrote:
>
> Thanks Ilya, Paul.
>
> I don't have the panic traces, and they are probably not related to rbd.
> I was merely describing our use case.
>
> On the setup that we manage, we have a software layer similar to Kubernetes
> CSI that orchestrates volume map/unmap on behalf of the users. We are
> currently using volume locks to protect the volumes from inadvertent
> concurrent write mounts that could lead to FS corruption, as most of the
> volumes run ext3/4.
>
> So in our orchestration, we take a shared lock on volumes that are mounted
> read-only, so we can allow multiple concurrent read-only mounts, and we take
> an exclusive lock for read-write mounts so that we can reject other RO/RW
> mounts while the first RW mount is in use.
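>
> A simplified sketch of that scheme in python-rbd terms (the cookie and
> tag values are placeholders; shared lockers must pass the same tag to
> coexist):
>
>     import rbd
>
>     def lock_for_mount(image: rbd.Image, cookie: str, read_only: bool):
>         if read_only:
>             # Any number of shared lockers may coexist, as long as
>             # they use the same tag.
>             image.lock_shared(cookie, 'ro-mounts')
>         else:
>             # Excludes all other lockers; an incompatible existing
>             # lock surfaces as rbd.ImageBusy.
>             image.lock_exclusive(cookie)
>
>     def unlock_after_unmount(image: rbd.Image, cookie: str):
>         image.unlock(cookie)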
>
> All this orchestration happens in a distributed manner across all our
> compute nodes, so it is not easy to determine when we should kick out the
> dead connections and claim the lock. For now we have to intervene manually
> to resolve such issues, so I am looking for a way to do this
> deterministically.
>
> Thanks,
> Shridhar
>
>
> On Wed, 8 Apr 2020 at 02:48, Ilya Dryomov <idryomov(a)gmail.com> wrote:
>>
>> On Tue, Apr 7, 2020 at 6:49 PM Void Star Nill <void.star.nill(a)gmail.com> wrote:
>> >
>> > Hello All,
>> >
>> > Is there a way to specify that a lock (shared or exclusive) on an rbd
>> > volume be released if the client machine becomes unreachable or
>> > unresponsive?
>> >
>> > In one of our clusters, we use rbd locks on volumes to provide a kind of
>> > shared or exclusive access - to make sure there are no writers when
>> > someone is reading and there are no readers when someone is writing.
>> >
>> > However, we often run into issues when one of the machines goes into a
>> > kernel panic or something and the whole pipeline gets stalled.
>>
>> What kind of kernel panics are you running into? Do you have any panic
>> messages or stack traces captured?
>>
>> Thanks,
>>
>> Ilya