Hi Folks,
Fixed the subject line since it still had last week's meeting listed.
:) The performance meeting will be starting in roughly 1 hour! Today
Adam and I will give an update on the bluefs_buffered_io=disabled and
omap performance investigation, since we weren't able to get to it
last week. Hope to see you there!
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Mark
Hi Ceph,
TL;DR: If you have one day a week to work on the next Ceph stable releases[0], your help would be most welcome.
The Ceph stable releases[1] - currently octopus[2] and nautilus[3] - are used by individuals, non-profits, government agencies and companies for their production Ceph clusters. They are also used when Ceph is integrated into larger products, such as hardware appliances. Ceph packages for a range of supported distributions are available at https://ceph.io/. Before the packages for a new stable release are published, they are carefully tested for potential regressions or upgrade problems. The Ceph project makes every effort to ensure the packages published at https://ceph.io/ can be used and upgraded in production.
The Stable release team[4] plays an essential role in the making of each Ceph stable release. In addition to maintaining an inventory of bugfixes that are in various stages of backporting, in most cases we do the actual backporting ourselves. We also run integration tests involving hundreds of machines[5] and analyze the results when tests fail. The developers of the bugfixes only hear from us when we're stuck, or when it's time for the final decision on whether to merge a backport into the stable branch. Our process is well documented[6] and participating is a relaxing experience (IMHO ;-). Every month or so we have the satisfaction of seeing a new stable release published.
David Galloway (Red Hat), Nathan Cutler (SUSE), Loïc Dachary (Easter-Eggs), Yuri Weinstein (Red Hat), Abhishek Lekshmanan (SUSE) and Adam Kraitman (IBM) are the currently active members of the Stable release team. Membership is not a lifelong commitment: it is a community effort, and the size of the team fluctuates over time. I participated from the beginning until 2017 and returned after an absence of over three years. With the release of Pacific scheduled for March 2021, additional work is expected, and we would like to invite you to participate. If you're employed by a company using Ceph or doing business with it, maybe your manager could agree to give back to the Ceph community in this way. My employer[7] agreed when I asked: it's worth a try ;-)
Cheers
[0] https://docs.ceph.com/en/latest/releases/general/#active-stable-releases
[1] https://docs.ceph.com/en/latest/releases/general/#lifetime-of-stable-releas…
[2] https://docs.ceph.com/en/latest/releases/octopus/#v15-2-0-octopus
[3] https://docs.ceph.com/en/latest/releases/nautilus/#v14-2-0-nautilus
[4] Ceph Stable releases home page http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO
[5] Integration tests http://tracker.ceph.com/projects/ceph-releases/wiki/HOWTO_run_integration_a…
[6] https://github.com/ceph/ceph/blob/master/SubmittingPatches-backports.rst
[7] https://easter-eggs.com/
--
Loïc Dachary, Artisan Logiciel Libre
Hi everyone!
I'm excited to announce two talks we have on the schedule for February 2021:
Jason Dillaman will be giving part 2 of the librbd code walk-through.
The stream starts on February 23rd at 18:00 UTC / 19:00 CET / 1:00 PM
EST / 10:00 AM PST.
https://tracker.ceph.com/projects/ceph/wiki/Code_Walkthroughs
Part 1: https://www.youtube.com/watch?v=L0x61HpREy4
--------------
What's New in the Pacific Release
Hear Sage Weil give a live update on the development of the Pacific Release.
The stream starts on February 25th at 17:00 UTC / 18:00 CET / 12:00 PM
EST / 9:00 AM PST.
https://ceph.io/ceph-tech-talks/
All live streams will be recorded.
--
Mike Perez
Hi all,
Over the last few weeks, the mgr dashboard module has suffered several
bugs related to Python package versions:
1) mgr/dashboard: dashboard hangs when accessing it
https://tracker.ceph.com/issues/48973
cheroot 8.5.1 was causing a hang; we needed to upgrade to the distro's
cheroot 8.5.2 package.
2) mgr/dashboard: ERROR: test_a_set_login_credentials
(tasks.mgr.dashboard.test_auth.AuthTest)
https://tracker.ceph.com/issues/49574
We adapted the code to PyJWT >= 2.0.0 (our tests were relying on
version 1.6.4); see the sketch after this list.
3) In the nautilus branch, we cannot use features from newer versions
of the "six" package, like "ensure_text" (available in version >= 1.12),
because distros like Ubuntu Bionic seem to be stuck on version 1.11.
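Regarding (2), here is a minimal sketch of the kind of compatibility
shim involved; the payload and secret below are made up for
illustration:

import jwt

payload = {"username": "admin"}   # illustrative claim
secret = "change-me"              # illustrative secret

# PyJWT >= 2.0 returns a str from encode(); 1.x returned bytes.
token = jwt.encode(payload, secret, algorithm="HS256")
if isinstance(token, bytes):  # PyJWT 1.x
    token = token.decode("utf-8")

# PyJWT >= 2.0 requires an explicit algorithms list in decode().
decoded = jwt.decode(token, secret, algorithms=["HS256"])
assert decoded["username"] == "admin"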
Taking into consideration all of the above, and the fact that version
pinning/constraints can provide stability and reliability, e.g.:
https://github.com/psf/requests/blob/master/setup.py#L44
https://github.com/tiangolo/fastapi/blob/master/pyproject.toml#L34
PROPOSAL:
Include a virtualenv (or equivalent) in the mgr/dashboard package(s),
created during the build process, with pinned dependencies, so that we
no longer depend on system packages for those dependencies.
We're already doing this for the Dashboard frontend (Angular
single-page app) npm dependencies:
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/frontend/…
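To make the idea concrete, here is a minimal sketch of what such a
build step could look like; the pinned package list below is
illustrative only, and the real one would come from a reviewed
constraints file:

import subprocess
import venv

# Illustrative pinned set; the real list would come from a reviewed
# constraints file maintained in the tree.
PINNED = [
    "cheroot==8.5.2",
    "PyJWT==2.0.1",
    "six==1.15.0",
]

def build_bundled_venv(target_dir: str) -> None:
    """Create a self-contained venv and install the pinned deps into it."""
    venv.EnvBuilder(with_pip=True).create(target_dir)
    subprocess.check_call(
        [target_dir + "/bin/pip", "install", "--no-cache-dir"] + PINNED
    )

if __name__ == "__main__":
    build_bundled_venv("build/mgr-dashboard-venv")

The resulting venv would ship inside the mgr/dashboard package payload
instead of relying on the distro's Python packages.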
PROS:
* Stability: no longer exposed to potential bugs coming from
side-effects of third-party decisions, like distro maintainers' package
upgrade policies.
* Reliability: code is tested against well-known package versions; this
avoids the problem of package version disparity across distros and
distro releases.
* Flexibility: the possibility of using packages that are currently not
available on all supported distros (or whose required version is not
available).
CONS:
* Package build size increase: we should measure the new size and maybe
consider an incremental approach: as a first step, pin only those
packages that are "problematic" (high version disparity across
distros/releases, not available on supported distros, etc.).
* Taking care of CVE-related security updates for pinned versions: one
possibility is tracking security updates with tools like this:
https://docs.github.com/en/github/managing-security-vulnerabilities/configu…
Let me know of any other considerations that we should take into account.
Regards,
--
Alfonso Martínez
Senior Software Engineer, Ceph Storage
Red Hat
Hello Lucian,
After you confirmed in private correspondence that, unless there is
huge demand for it, Server 2008 R2 support is unlikely to come soon, I
bit the bullet and tried the Windows port on Server 2019. I used both
the SES4Win driver and Cloudbase Solutions' WNBD driver, compiled from
your repository.
My use case is backing up a bunch of files to an RBD image mounted with
WNBD and formatted as NTFS. The backup is done with robocopy. File
sizes vary from a few KB to 1-2 GB at most, though the vast majority
are under 10 MB. The total size is about 180 GB.
The problem is that after enough data has been copied, the WNBD-mounted
NTFS volume gets corrupted and copying stops. With the SES4Win driver,
the copy stopped after about 77 GB had been copied. Cloudbase's did a
bit better, managing about 120 GB.
The NTFS filesystem in the RBD image is rendered unusable. 'rbd unmap'
will fail and resort to forceful unmapping (pardon the vague terms; I
didn't write it down the first time, and I haven't unmapped it this
time yet, in case you need me to do something). Mapping it again
succeeds, but Windows sees only a raw, unformatted partition. That is
to say, the partition table (GPT) survived, but the NTFS filesystem was
toasted.
On the last attempt, with the Cloudbase storage driver, the failure
manifests in the robocopy output like this:
> 100% New File 424404 5.blabla.pdf
> 2021/03/06 00:08:36 ERROR *1393* (0x00000571) Time-Stamping
> Destination Directory
> r:\pf-bak\2021-05-03-18-24-31\docs\dir1\dir2\[rest of path removed.....]
> The disk structure is corrupted and unreadable.
> Waiting 1 seconds... Retrying...
>
> [... more retries, same error message ...]
>
> ERROR: RETRY LIMIT EXCEEDED.
>
> New Dir 5 \\10.207.6.1\c$\publicfolders\[path name
> removed.....]\
> 2021/03/06 00:08:42 ERROR *1392* (0x00000570) Creating Destination
> Directory r:\pf-bak\2021-05-03-18-24-31\docs\dir1\dir2\[rest of path
> removed.....]
> The file or directory is corrupted and unreadable.
> Waiting 1 seconds... Retrying...
>
The errors go on like this until robocopy is able to write again, on a
different path:
> ERROR: RETRY LIMIT EXCEEDED.
>
> 2021/03/06 00:15:35 ERROR 1392 (0x00000570) Creating Destination
> Directory r:\pf-bak\2021-05-03-18-24-31\docs\dir1\dir2\[rest of path
> removed.....]
> The file or directory is corrupted and unreadable.
>
> New Dir 37
> \\10.207.6.1\c$\publicfolders\docs\dir1\*dir3*\
> 100% New File 475357 blabla.pdf
But a bit further on, errors again:
> New Dir 4
> \\10.207.6.1\c$\publicfolders\docs\dir1\dir3\[rest of path removed.....]
> 100% New File 1.1 m blabla.pdf
> 100% New File 1.1 m blabla.pdf
> 100% New File 1.0 m blabla.pdf
> 100% New File 1.0 m blabla.pdf
> New Dir 15 \\10.207.6.1\c$\publicfolders\docs\*dir4*\
> 2021/03/06 01:15:21 ERROR 1392 (0x00000570) Copying NTFS Security to
> Destination Directory r:\pf-bak\2021-05-03-18-24-31\docs\dir4\
> The file or directory is corrupted and unreadable.
>
> New Dir 0 \\10.207.6.1\c$\publicfolders\docs\dir5\
> 2021/03/06 01:15:21 ERROR 1392 (0x00000570) Copying NTFS Security to
> Destination Directory r:\pf-bak\2021-05-03-18-24-31\docs\dir5\
> The file or directory is corrupted and unreadable.
(the errors continue until source directory list is exhausted)
The errors with the SES4Win storage driver were much like the above.
Cloudbase driver version string is 16.59.18.505, dated March 5th, 2021.
I don't have the SES4Win version handy but I believe it is well known.
My Ceph cluster version is 14.2.3.
This is the image being mapped:
rbd image 'grph.publicfolders.backup':
size 500 GiB in 128000 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 5f47936179aab9
block_name_prefix: rbd_data.5f47936179aab9
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
op_features:
flags:
create_timestamp: Thu Mar 4 23:11:25 2021
access_timestamp: Sat Mar 6 13:08:53 2021
modify_timestamp: Sat Mar 6 13:08:52 2021
Please tell me if there is any other info I can provide.
Thanks in advance,
-Kostas
All PRs have been tested and merged.
Still outstanding is a fix for the dashboard,
https://tracker.ceph.com/issues/49596 (Alfonso and Ernest are working
on it).
We plan to lock the nautilus repo and start QE validation for v14.2.7
as soon as this fix is merged.
Thx
YuriW
Hi,
I'm told help with backports[0] would be welcome. Since I've been out of touch with Ceph for years, it would be a gentle way to reconnect with the codebase. My employer[1] agreed to let me spend two days a month contributing to this specific topic. It's not much, but I hope it will be enough to get useful things done. That being said... I wouldn't know where to start, and I would welcome any advice you may have :-)
Cheers
[0] https://tracker.ceph.com/projects/ceph-releases
[1] https://www.easter-eggs.com/
--
Loïc Dachary, Artisan Logiciel Libre
Hi,
We have been testing ceph-dokan, based on the guide here:
<https://documentation.suse.com/ses/7/single-html/ses-windows/index.html#win…>
And watching <https://www.youtube.com/watch?v=BWZIwXLcNts&ab_channel=SUSE>
Initial tests on a Windows 10 VM show good write speed - around
600 MB/s, which is faster than our Samba server.
What worries us is using the "root" ceph.client.admin.keyring on a
Windows system, as it gives access to the entire cephfs cluster, which
in our case is 5 PB.
I'd really like this to work, as it would let user-administered Windows
systems that control microscopes save data directly to cephfs, so that
we can process the data on our HPC cluster.
I'd normally use cephx, and make a key that allows access to a directory
off the root.
e.g.
[root@ceph-s1 users]# ceph auth get client.x_lab
exported keyring for client.x_lab
[client.x_lab]
key = xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
caps mds = "allow r path=/users/, allow rw path=/users/x_lab"
caps mon = "allow r"
caps osd = "allow class-read object_prefix rbd_children, allow rw
pool=ec82pool"
The real key works fine on Linux, but when we try this key with
ceph-dokan and specify the ceph directory (x_lab) as the ceph path,
there is no option to specify the user. Is it hard-coded as admin?
Have I just missed something, or is this a missing feature?
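For what it's worth, on Linux the librados Python bindings let us pick
the client identity explicitly; here is a minimal sketch (the paths are
illustrative, and the client name mirrors the keyring above):

import rados

# Connect as the restricted client.x_lab identity, not client.admin.
cluster = rados.Rados(
    conffile="/etc/ceph/ceph.conf",   # illustrative path
    name="client.x_lab",
)
cluster.conf_set("keyring", "/etc/ceph/ceph.client.x_lab.keyring")
cluster.connect()
print(cluster.get_fsid())   # confirm we can reach the cluster
cluster.shutdown()

Something equivalent in ceph-dokan would solve our problem.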
Anyhow, ceph-dokan looks like it could be quite useful.
Thank you, Cloudbase :)
best regards,
Jake
--
Dr Jake Grimmett
Head Of Scientific Computing
MRC Laboratory of Molecular Biology
Francis Crick Avenue,
Cambridge CB2 0QH, UK.
On 4/30/20 1:54 PM, Lucian Petrut wrote:
> Hi,
>
> We’ve just pushed the final part of the Windows PR series[1], allowing
> RBD images as well as CephFS to be mounted on Windows.
>
> There’s a comprehensive guide[2], describing the build, installation,
> configuration and usage steps.
>
> 2 out of 12 PRs have been merged already, we look forward to merging the
> others as well.
>
> Lucian Petrut
>
> Cloudbase Solutions
>
> [1] https://github.com/ceph/ceph/pull/34859
>
> [2] https://github.com/petrutlucian94/ceph/blob/windows.12/README.windows.rst
>
> *From: *Lucian Petrut
> *Sent: *Monday, December 16, 2019 10:12 AM
> *To: *dev@ceph.io
> *Subject: *Windows port
>
> Hi,
>
> We're happy to announce that a couple of weeks ago we submitted a
> few GitHub pull requests[1][2][3] adding initial Windows support. A big
> thank you to the people who have already reviewed the patches.
>
> To bring some context about the scope and current status of our work:
> we're mostly targeting the client side, allowing Windows hosts to
> consume rados, rbd and cephfs resources.
>
> We have Windows binaries capable of writing to rados pools[4]. We're
> using mingw to build the Ceph components, mostly because it requires
> the fewest changes to cross-compile Ceph for Windows. However, we're
> soon going to switch to MSVC/Clang due to mingw limitations and
> long-standing bugs[5][6]. Porting the unit tests is also something
> that we're currently working on.
>
> The next step will be implementing a virtual miniport driver so that RBD
> volumes can be exposed to Windows hosts and Hyper-V guests. We're hoping
> to leverage librbd as much as possible, as part of a daemon that will
> communicate with the driver. We're also targeting cephfs, and we're
> considering using Dokan, which is FUSE-compatible.
>
> Merging the open PRs would allow us to move forward, focusing on the
> drivers and avoiding rebase issues. Any help on that is greatly appreciated.
>
> Last but not least, I'd like to thank SUSE, who's sponsoring this effort!
>
> Lucian Petrut
>
> Cloudbase Solutions
>
> [1] https://github.com/ceph/ceph/pull/31981
>
> [2] https://github.com/ceph/ceph/pull/32027
>
> [3] https://github.com/ceph/rocksdb/pull/42
>
> [4] http://paste.openstack.org/raw/787534/
>
> [5] https://sourceforge.net/p/mingw-w64/bugs/816/
>
> [6] https://sourceforge.net/p/mingw-w64/bugs/527/
>
>
> _______________________________________________
> Dev mailing list -- dev@ceph.io
> To unsubscribe send an email to dev-leave@ceph.io
>
Hi folks,
I'm wondering if there is a plan to create a new build of Ceph Octopus
including [0] (this change was merged on Feb 9th). I was hoping it was
included in the current build available at [1] (which was published
last Feb 23rd).
We need those backports for a feature in OpenStack Manila that we expect to
release in the upcoming weeks. Right now we are testing with Ceph Pacific,
but we need to test with Octopus as well since our CI is aligned with that
version.
Thanks,
Victoria
[0] https://github.com/ceph/ceph/pull/38612
[1] https://download.ceph.com/debian-octopus/