On Wed, Nov 13, 2019 at 8:46 AM Sage Weil <sweil@redhat.com> wrote:
On Wed, 13 Nov 2019, Paul Cuzner wrote:
Hi Sage,
So I tried switching out the udev calls to pyudev, and shaved a whopping
1 sec from the timings. Looking deeper, I found that the issue is related
to *ALL* subprocess.Popen calls (of which there are many!) - they all use
close_fds=True.
My suspicion is that when running in a container, the close_fds logic sees
the host's (much larger) fd limit too - so it tries to tidy up far more
descriptors than it should. If you set ulimit -n 1024 or so and then try a
ceph-volume inventory, it should just fly through! (At least it did for me.)
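A minimal timing sketch of the effect described above (POSIX assumed; the
loop count and the use of `true` as a cheap child process are arbitrary
choices for illustration). On older Pythons, close_fds=True walks every
possible descriptor up to the soft RLIMIT_NOFILE limit, so a huge host
limit inside a container makes each Popen call expensive:

```python
import resource
import subprocess
import time

# The cost of close_fds=True scales with the soft RLIMIT_NOFILE limit,
# since each fd up to that limit may be closed one by one in the child.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"RLIMIT_NOFILE soft limit: {soft}")

start = time.monotonic()
for _ in range(10):
    # `true` exits immediately; we only measure process-spawn overhead.
    subprocess.Popen(["true"], close_fds=True).wait()
elapsed = time.monotonic() - start
print(f"10 Popen(close_fds=True) calls took {elapsed:.3f}s")
```

Lowering the limit with `ulimit -n 1024` before running should shrink the
measured time noticeably on affected Python versions.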
Let me know if this works for you.
Yes, that speeds it up significantly! 1.5s -> 0.2s in my case. I can't say
that I understand why, though... it seems like ulimit -n would make file
open attempts fail, but I don't see any failures.
Can we drop the close_fds arg?
We shouldn't, because there is a risk of system calls hanging. In
recent Python 3 versions it actually defaults to True.
The Python docs explain that: "If close_fds is true, all file
descriptors except 0, 1 and 2 will be closed before the child process
is executed."
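A small demonstration of that documented behavior (POSIX assumed; the pipe
and the probe script are illustrative). Even a descriptor the parent has
explicitly marked inheritable does not survive into the child when
close_fds=True:

```python
import os
import subprocess
import sys

# Create an extra fd in the parent and mark it inheritable, so only
# close_fds (not the default non-inheritable flag from PEP 446) can
# be responsible for it disappearing in the child.
r, w = os.pipe()
os.set_inheritable(w, True)

# The child probes whether the write end survived the exec.
probe = (
    f"import os\n"
    f"try:\n"
    f"    os.fstat({w}); print('inherited')\n"
    f"except OSError:\n"
    f"    print('closed')\n"
)
out = subprocess.run(
    [sys.executable, "-c", probe],
    close_fds=True, capture_output=True, text=True,
).stdout.strip()
print(out)  # → closed
os.close(r)
os.close(w)
```

With close_fds=False (and the fd marked inheritable) the probe would
print "inherited" instead; pass_fds offers a way to whitelist specific
descriptors while keeping close_fds=True.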
This is particularly useful when mixing calls that send data over STDIN
(as ceph-volume does) with subsequent subprocess calls. That mix can
*potentially* cause hangs: a stray inherited descriptor can hold a pipe
end open, so the child never sees EOF. This hanging behavior was common
in ceph-deploy, for example, until this option was used.
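A minimal sketch of the safe pattern described above (using `cat` as a
stand-in for a real system tool): feed STDIN through communicate() with
close_fds=True (the Python 3 default), so no accidentally inherited pipe
end can keep the stream open and hang the child:

```python
import subprocess

# Send a payload over STDIN and read the result back; close_fds=True
# ensures no unrelated descriptor leaks into the child and holds the
# pipe open past communicate()'s own close of the write end.
proc = subprocess.Popen(
    ["cat"],                 # stand-in for a real external command
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    close_fds=True,
)
out, _ = proc.communicate(input=b"some stdin payload\n")
print(out.decode().strip())  # → some stdin payload
```

communicate() also waits for the process to exit, avoiding the deadlocks
that come from writing to stdin and reading stdout by hand.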
sage
_______________________________________________
Dev mailing list -- dev@ceph.io
To unsubscribe send an email to dev-leave@ceph.io