On Mon, Dec 2, 2019 at 12:48 PM Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
Hi Ilya,
ISTR there were some anti-spam measures put in place. Is your account
waiting for manual approval? If so, David should be able to help.
Yes, if I remember correctly, I get "waiting for approval" when I try to log in.
Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9287
ffff911a9a26bd00 fail -12
Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9283
It is failing to allocate memory. "low load" isn't very specific,
can you describe the setup and the workload in more detail?
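For reference, the "fail -12" in the log lines above is a negated errno value, and 12 is ENOMEM on Linux. A minimal Python sketch for decoding such a code (decode_kernel_errno is just an illustrative helper name, not something from the kernel or Ceph):

```python
import errno
import os


def decode_kernel_errno(code):
    """Map a kernel-style negative return code to (symbolic name, message)."""
    num = abs(code)
    # errno.errorcode maps numeric values to names such as "ENOMEM".
    name = errno.errorcode.get(num, "UNKNOWN")
    # os.strerror gives the platform's human-readable text for the code.
    return name, os.strerror(num)


name, message = decode_kernel_errno(-12)
print(name, "-", message)
```

The message text depends on the platform's C library, so only the symbolic name should be relied on.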
4 nodes (osd and mon combined); the 4th node has a local cephfs mount and
is rsync'ing some files from VMs. By 'low load' I mean this is a sort of
test setup that is going to production. Mostly the nodes are below a load
of 1 (except when the concurrent rsync starts).
How many snapshots do you have?
I don't know how to count them. I have a script running on 2000 dirs. If
one of these dirs is not empty, it creates a snapshot. So in theory I
could have 2000 x 7 days = 14000 snapshots.
(btw the cephfs snapshots are in a different tree than the rsync is
using)
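One possible way to count them, as a rough sketch only: CephFS exposes snapshots as entries in a hidden ".snap" directory inside each directory, so listing those gives a per-directory count. The helper name count_snapshots is hypothetical. The sketch assumes the default snapshot directory name, and it assumes that entries starting with "_" are snapshots inherited from a parent directory, which it skips so they are not counted twice.

```python
import os
import sys

SNAP_DIR = ".snap"  # assumption: default CephFS snapshot directory name


def count_snapshots(root):
    """Return {directory path: snapshot count} for each immediate subdirectory of root."""
    counts = {}
    with os.scandir(root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            snap_path = os.path.join(entry.path, SNAP_DIR)
            try:
                names = os.listdir(snap_path)
            except OSError:
                # No snapshot directory visible here; skip this directory.
                continue
            # Assumption: names starting with "_" are inherited from an ancestor.
            own = [n for n in names if not n.startswith("_")]
            counts[entry.path] = len(own)
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    per_dir = count_snapshots(sys.argv[1])
    print("total snapshots:", sum(per_dir.values()))
```

Run it with the parent of the 2000 dirs as the only argument; the total is the sum over all subdirectories.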
Is there a reason you are snapshotting each directory individually
instead of just snapshotting a common parent?
If you have thousands of snapshots, you may eventually hit a different
bug:
Be aware that each set of 512 snapshots amplifies your writes by 4K in
terms of network consumption. With 14000 snapshots, a 4K write would
need to transfer ~109K worth of snapshot metadata to carry itself out.
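The arithmetic behind the ~109K figure, as a small sketch. It assumes the overhead scales linearly at 4K per 512 snapshots, as described above; the function and constant names are only illustrative.

```python
SNAPS_PER_UNIT = 512        # snapshots per 4K of extra metadata (from the text above)
BYTES_PER_UNIT = 4 * 1024   # 4K


def snapshot_overhead_bytes(num_snapshots):
    """Estimate the snapshot metadata sent per write, assuming linear scaling."""
    return num_snapshots / SNAPS_PER_UNIT * BYTES_PER_UNIT


overhead = snapshot_overhead_bytes(14000)
print(f"{overhead / 1024:.1f}K")  # prints "109.4K"
```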
Thanks,
Ilya