Hi everyone,
Highlights from this week:
* Ceph User Survey '21: Dashboard analysis
* Some responses looked contradictory (e.g. reporting not using the
dashboard and then giving it a 5/5 rating)
* Would you recommend the dashboard?
- average 6.75
- median 8
* Does the dashboard allow you to perform tasks faster than the CLI?
- No 52
- Yes 48
* What's most used in the dashboard?
- Landing page
- iSCSI and NFS least used
- RBD > CephFS, consistent with overall Ceph usage
- RGW should rank higher than CephFS (62% vs 54% Ceph-wide usage)
* Detractors still use the landing page, cluster logs, mon status, and
Grafana dashboards
* Dashboard user Profiles
- Haven't used the dashboard yet
- Older versions of Ceph Dashboard
- Documentation issues?
- Getting started doc refers to the CLI
- Performance concerns?
* Monitoring only
- Read-only mode
- Fear of breaking things?
* Promoters
* CLI junkies
- Claim: "CLI is faster, easy to remember and scriptable
- Dashboard provides bulk ops and multi-step workflows
- API recorder in OpenAttic:
https://github.com/openattic/openattic/search?q=recorder
- Dashboard maps 1:1 to Ceph CLI
* Other mgmt tools
- Using Ceph-ansible, Puppet, Rook
- Other
- Proxmox
- Nagios
* Actions
- Focus efforts on the landing page and cluster logs
- Increase inline docs and contextual helpers
- Add support for bulk ops
- REST API recorder: generate a script from UI actions (see the sketch at the end of these notes)
- Publicize the dashboard in the Ceph blog and tech talks.
* Ceph.io website updates
- The working group's ideal launch date is June 24th
- Content needed for pages
- https://dev.ceph.io/en/
* Market Development Working Group
- The group would like to be more connected with the CLT on the
roadmap for releases.
- They meet once a month.
- They're interested in roadmaps, blog posts, ...
- Mike: send out a pad for gathering high-level features/improvements
for the Quincy roadmap
* NFS questions on dev@ need answers...
- ganesha and dir_chunk=0 - problematic for RGW exports
* GSoC onboarding / collaboration
- Ali to organize
* Octopus: subinterpreters again
- Planned to be discussed at the next CDM
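On the REST API recorder action item above: a rough, purely illustrative
sketch of the idea (not an existing dashboard feature) is to wrap every REST
call made on behalf of a UI action and append an equivalent curl line to a
script, so the workflow can later be replayed from the shell. The endpoint
and payload in the usage comment are made up.

    import json
    import shlex

    import requests


    class RecordingSession(requests.Session):
        """requests.Session that appends an equivalent curl command to a
        shell script for every call it makes, so a sequence of UI-driven
        REST calls can be replayed later."""

        def __init__(self, script_path="replay.sh"):
            super().__init__()
            self.script_path = script_path
            with open(self.script_path, "w") as f:
                f.write("#!/bin/sh\n")

        def request(self, method, url, **kwargs):
            resp = super().request(method, url, **kwargs)
            cmd = ["curl", "-X", method.upper(), url]
            if kwargs.get("json") is not None:
                cmd += ["-H", "Content-Type: application/json",
                        "-d", json.dumps(kwargs["json"])]
            with open(self.script_path, "a") as f:
                f.write(" ".join(shlex.quote(part) for part in cmd) + "\n")
            return resp

    # Hypothetical usage: route the dashboard backend's (or a test
    # harness's) REST calls through the recording session, e.g.
    #   s = RecordingSession()
    #   s.post("https://dashboard.example:8443/api/pool",
    #          json={"pool": "mypool", "pg_num": 32})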
--
Mike Perez
Adding dev list.
Jeff, is it okay to remove dir_chunk=0 for the cephfs case?
sage
On Tue, May 25, 2021 at 7:43 AM Daniel Gryniewicz <dang(a)redhat.com> wrote:
>
> I think dir_chunk=0 should never be used, even for cephfs. It's not
> intended to be used in general, only for special circumstances (an
> out-of-tree FSAL asked for it, and we use it upstream for debugging
> readdir), and it may go away in a future version of Ganesha.
>
> The rest is probably okay for both of them. However, this raises some
> issues. Some settings, such as dir_chunk=0, Attr_Expiration_Time=0, and
> only_numeric_owners=true are global to Ganesha. This means that, if
> CephFS and RGW need different global settings, they'd have to run in
> different instances of Ganesha. Is this something we're interested in?
>
> Daniel
>
> On 5/25/21 8:11 AM, Sebastian Wagner wrote:
> > Moving this to upstream, as this is an upstream issue.
> >
> > Hi Mike, hi Sage,
> >
> > Do we need to rethink how we deploy ganesha daemons? Looks like we need
> > different ganesha.conf templates for cephfs and rgw.
> >
> > - Sebastian
> >
> > Am 25.05.21 um 13:59 schrieb Matt Benjamin:
> >> Hi Sebastian,
> >>
> >> 1. yes, I think we should use different templates
> >> 2. MDCACHE { dir_chunk = 0; } is fatal for RGW NFS--it seems suited to
> >> avoid double caching of vnodes in the cephfs driver, but simply cannot
> >> be used with RGW
> >> 3. RGW has some other preferences--for example, some environments
> >> might prefer only_numeric_owners = true; Sage is already working on
> >> extending cephadm to generate exports differently, which should allow
> >> for multiple tenants
> >>
> >> Matt
> >>
> >> On Tue, May 25, 2021 at 7:39 AM Sebastian Wagner <sewagner(a)redhat.com>
> >> wrote:
> >>> Hi Matt,
> >>>
> >>> This is the ganesha.conf template that we use for both cephfs and rgw:
> >>>
> >>> https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/templates/s…
> >>>
> >>>
> >>> I have the slight impression that we might need two different templates
> >>> for rgw and cephfs?
> >>>
> >>> Best,
> >>> Sebastian
> >
> > ...snip...
> >
>
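To make the settings discussed above concrete, here is a rough sketch of how
the CephFS and RGW ganesha.conf fragments could diverge if they were split
into separate templates (or separate Ganesha instances, as Daniel suggests).
The values and export details are illustrative only, and the exact block
placement of the global options should be checked against the Ganesha
documentation:

    # CephFS-backed instance: dir_chunk = 0 to avoid double caching of
    # vnodes in the cephfs driver (per the thread above).
    MDCACHE {
        Dir_Chunk = 0;
    }
    EXPORT {
        Export_ID = 1;
        Path = "/";
        Pseudo = "/cephfs";
        FSAL {
            Name = CEPH;
        }
    }

    # RGW-backed instance: dir_chunk stays at its default, and some
    # environments may want only_numeric_owners = true.
    NFSV4 {
        Only_Numeric_Owners = true;
    }
    EXPORT {
        Export_ID = 2;
        Path = "/";
        Pseudo = "/rgw";
        FSAL {
            Name = RGW;
        }
    }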
Hello Everyone,
I am Mishal. I was selected for GSoC 2021 and am working on a project
involving Ceph.
Here is the link to the project:
https://summerofcode.withgoogle.com/projects/#6366172691300352
I wanted to ask about the direct usage of the librados library to
integrate the PDC API with a Ceph cluster.
If any of you have worked directly with librados, please let me know
so I can clear up my doubts.
Thanks,
Mishal
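For anyone who wants to point Mishal in the right direction, here is a
minimal sketch of direct librados usage through the Python bindings; the
conffile path, pool name and object name are placeholders:

    import rados

    # Connect to the cluster using a local ceph.conf (placeholder path).
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        # Open an I/O context on an existing pool (placeholder name).
        ioctx = cluster.open_ioctx('mypool')
        try:
            # Write an object and read it back.
            ioctx.write_full('pdc-test-object', b'hello from librados')
            print(ioctx.read('pdc-test-object'))
        finally:
            ioctx.close()
    finally:
        cluster.shutdown()

The C API in librados.h follows the same connect / open-ioctx / read-write
pattern.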
Hi Folks,
The performance meeting will be starting in 20 minutes. Today we will
talk about gathering performance statistics for telemetry, buffered IO,
and Linux sync behavior. Please feel free to add your own topic. Hope
to see you there!
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Mark
RHEL 8.4 was GA'ed on Tuesday. We have FOG images for it in the lab now.
The qa yamls should be updated accordingly.
As a side note, I'm going to start cleaning up old versions more
aggressively. I had incorrectly assumed nothing was using the RHEL 8.0
or 8.1 images anymore so I deleted those repos from the Satellite
server. RHEL 8.0, for example, is over 2 years old.
The latest distros we have and should be targeting now are:
Ubuntu Focal 20.04
CentOS 8.3
RHEL 8.4
CentOS 8 and 9 Stream
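For example, a distro fragment in the qa suites would be bumped to something
roughly like this (a sketch only; the exact files differ per suite):

    os_type: rhel
    os_version: "8.4"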
--
David Galloway
Senior Systems Administrator
Ceph Engineering
Hello,
Highlights from this week:
- Security fixes: There is a demand for more transparency about security
issues fixed in dot releases. We're not using regular PRs for these, so
the regular release note process does not work here.
- There is a GitHub feature [2] for managing security issues. Might be
worth trying it out. Patrick used it in the past year; it didn't work
the way we thought.
- Semantic releases? [3]. But the issue right now is less about
automation and more about writing the actual notes.
- When we run the final QA cycle, we could create the release notes PR?
David: fine with me.
- Ceph Month in June. Schedule finalized. Let's pre-populate discussion
topics; it might help get people interested.
- nfs-rgw: We know where we want to go.
- branch-dance: David wants to work on the release script. Josh is happy
to help out.
Best,
Sebastian
[1]: https://pad.ceph.com/p/clt-weekly-minutes
[2]: https://docs.github.com/en/code-security/security-advisories/collaborating-…
[3]: https://github.com/semantic-release/semantic-release
Thanks,
On Wed, May 19, 2021 at 11:32:04AM +0800, Zhi Zhang wrote:
> On Wed, May 19, 2021 at 11:19 AM Zhi Zhang <zhang.david2011(a)gmail.com>
> wrote:
>
> >
> > On Tue, May 18, 2021 at 10:58 PM Mykola Golub <to.my.trociny(a)gmail.com>
> > wrote:
> > >
> > > Could you please provide the full rbd-nbd log? If it is too large for
> > > the attachment then may be via some public url?
> >
> > ceph.rbd-client.log.bz2
> > <https://drive.google.com/file/d/1TuiGOrVAgKIJ3BUmiokG0cU12fnlQ3GR/view?usp=…>
> >
> > I uploaded it to Google Drive. Please check it out.
>
> We found the reader_entry thread got zero bytes when trying to read the nbd
> request header, then rbd-nbd exited and closed the socket. But we haven't
> figured out why it read zero bytes.
Ok. I was hoping to find some hint in the log as to why the read from the
kernel could return without data, but I don't see it.
From experience it could happen when rbd-nbd got stuck or was too
slow, so the kernel failed after a timeout, but it looked different in
the logs AFAIR. Anyway, you can try increasing the timeout using the
rbd-nbd --timeout (--io-timeout in newer versions) option. The default
is 30 sec.
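For example, something like the following when mapping the image (pool and
image names are placeholders; check which spelling of the option your
rbd-nbd version accepts):

    rbd-nbd map mypool/myimage --timeout 120      # older versions
    rbd-nbd map mypool/myimage --io-timeout 120   # newer versions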
If it does not help, you will probably find a clue by increasing the
kernel debug level for nbd (it seems it is possible to do).
--
Mykola Golub
Hi,
I would like to bring some attention to a problem we have been
observing with nautilus, and which I reported here [1].
If a pg is in backfill_unfound state ("unfound" objects were detected
during backfill), and one of the osds from the active set is restarted
the state changes to clean, losing the information about unfound
objects.
And when I tired to reproduce the issue on the master with the same
scenario, the status did not change, but I was observing the primary
osd crash after a non-primary restart.
I looked through the commit log and did not find a commit explicitly
saying (or hinting) that this problem was addressed in master, and
I see there was a large refactoring in the related code since
nautilus. So probably the issue was "solved" during the refactoring?
We would love to see the problem fixed in nautilus, and I would
like to backport the "fix", but right now I don't have a clear
understanding of whether there really was a fix in master and what to do
with that crash that may be related to the "fix".
I might try to find the commit that changed the behaviour by
bisecting, but this looks like a long way, so I want to ask here first
if anybody has a hint.
[1] https://tracker.ceph.com/issues/50351
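(For reference, the bisect workflow mentioned above would look roughly like
the sketch below; the good revision and the reproduce.sh script are
placeholders for whatever reproduces the backfill_unfound scenario.)

    git bisect start
    git bisect bad master                     # behaviour already changed here
    git bisect good <sha-with-old-behaviour>  # still shows the nautilus behaviour
    # reproduce.sh runs the backfill_unfound scenario, exiting 0 while the
    # old (nautilus-like) behaviour is observed and non-zero once it changes
    git bisect run ./reproduce.sh
    git bisect reset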
Thanks,
--
Mykola Golub