Hi Folks,
Meeting canceled for today, but work is on-going regarding bluestore
cache triming, RGW improvments, and crimson. We may have a couple of
updates to report next week.
Thanks,
Mark
Dear Community,
In several occasions I ran into the need of trying to build my code in
different OS environments. So, instead of spinning VMs, I decided to
try containerized builds. Some tweaks were needed in the Docker files,
but overall I managed to build the latest master against centos7,
ubuntu18, and fedora31. Feel free to try it out from here [1].
Note that your local code is mounted - no cloning inside the
container. Just the "build" directory is inside the container (so it
can be done in parallel to your regular "build" dir).
Also, note that centos 8 does not work yet :-(
Yuval
[1] https://github.com/yuvalif/ceph-builder
Hi all,
Does anyone meet with below problem when building Ceph master code
into RPM package? The head commit id is:
commit bf09a04d2275de2143feeb63f5e125396c7d4a72
Merge: e190825 7d2bd68
Author: Sage Weil <sage(a)redhat.com>
Date: Wed Oct 23 19:46:06 2019 -0500
Build fail problem log:
extracting debug info from
/home/rdma/rpmbuild/BUILDROOT/ceph-14.0.0-16421.gf71d2b3.el7.x86_64/usr/bin/ceph-mon
extracting debug info from
/home/rdma/rpmbuild/BUILDROOT/ceph-14.0.0-16421.gf71d2b3.el7.x86_64/usr/bin/ceph-osd
/usr/lib/rpm/find-debuginfo.sh: line 127: 1334864 Bus error
(core dumped) eu-strip --remove-comment $r $g -f "$1" "$2"
error: Bad exit status from /var/tmp/rpm-tmp.0sqY3a (%install)
B.R.
Changcheng
I'm trying to implement MDS daemon management for mgr/ssh and am
confused by the intent of the orchestrator interface.
- The add_mds() method takes a 'spec' StatelessServiceSpec that has
a ctor like
def __init__(self, name, placement=None, count=None):
but it is constructed only with a name:
@_write_cli('orchestrator mds add',
"name=svc_arg,type=CephString",
'Create an MDS service')
def _mds_add(self, svc_arg):
spec = orchestrator.StatelessServiceSpec(svc_arg)
That means count=1 and placement is unspecified. That's fine for Rook,
sort of, as long as you want exactly 1 MDS for each file system.
- Given that, can we rename the 'svg_arg' arg to 'name'?
- The 'name' here, IIUC, is the name of the grouping of daemons. I think
it was intended to be a file system, as per the docs:
The ``name`` parameter is an identifier of the group of instances:
* a CephFS file system for a group of MDS daemons,
* a zone name for a group of RGWs
but IIRC the new CephFS behavior is that all standby daemons go into the
same pool and are doled out to file systems that need them arbitrarily.
In that case, I think the only thing we would want to specify (in the rook
case where we don't pick daemon location) is the count of MDSs... and
then have a singel name grouping. Is that right for CephFS? I have a
feeling it won't work for the other daemon types, though, like NFS
servers, which *do* care what they are serving up.
- For SSH, none of that works, since we need to pass a location when
adding daemons. It seems like we want somethign closer to nfs_add,
which is
@_write_cli('orchestrator nfs add',
"name=svc_arg,type=CephString "
"name=pool,type=CephString "
"name=namespace,type=CephString,req=false",
'Create an NFS service')
i.e.,
* 'add' takes a 'name' (the actual daemon name) and a location (if the
orch needs it).
* 'rm' takes the same name and removes it.
* 'update' does the smarts of adding ($want - $have) daemons for a
given group and generating names for them. Something else organizes these
into groups (a common name prefix?). I.e., 'update' basically builds on
'add' and 'rm'.
And/or, we introduce some basic scheduling into ssh orchestrator (or
orchestrator_cli). I'm not sure this is actually that smart since we can
probably get away with something quite simple: round-robin assignment of
daemons to hosts, and the ability to label nodes for a daemon type or
daemon type + grouping. This would basically give ssh orch what ansible
does as far as mapping out the deployment, and gracefully degrade to
something that "just works" (well enough) when you don't know/care
where things land. Obviously having a real scheduler like that in k8s
do this is better, but for non-kube deployments, there is still a need for
placing daemons to hosts to make things easy for the human operator.
sage
Hi all,
About a year ago I started a weekly meeting focused on Ceph testing:
https://calendar.google.com/calendar/event?eid=MmxkMWZubzg2Zzg0a3VjaDJocGU4…
It meets every Wednesday at 8AM Pacific Time and has mostly focused on
teuthology and the qa suites. We've gotten some nice stuff done
through the meeting around teuthology and getting the Suse fork merged
into upstream, but things have mostly stalled out since then. I know
lots of other things are happening outside of that meeting, but I
haven't advertised it in a while and several people were or would be
involved have time conflicts.
So I'd like to see if there are other times that are good for people
to meet to try and coordinate some of those efforts more broadly, or
if there are other meetings we should piggy-back off of and expand to
try and get a broader and more stable audience. Things I know about:
* downgrade testing (Josh and Yuri?)
* several efforts around Kubernetes
* getting openSUSE into the upstream test matrix
* Building and testing containers (Dan+)
I've added some people I know are working on things or whose input
would be appreciated by others on the list, so please discuss! :)
-Greg
On 15:09 Tue 22 Oct, Shawn Chen wrote:
> Hi,changcheng, I wanna know if this problem has been resolved since I
> met this problem too~
> wish your reply, thanks.
I think there's one serious memory leak in Ceph/RDMA/iWARP implementation: The posted allocated outstanding work-request were not reclaimed.
The problem should be fixed in below PR:
https://github.com/ceph/ceph/pull/29947
You can use below master version to do verfication:
commit 9068510305
--Thanks
Changcheng
>
> ---------- Forwarded message ---------
> 发件人: Liu, Changcheng <changcheng.liu(a)intel.com>
> Date: 2019年5月23日周四 下午1:10
> Subject: Re: msg/async/rdma: out of buffer/memory
> To: Roman Penyaev <rpenyaev(a)suse.de>
> Cc: <ceph-devel(a)vger.kernel.org>, <ceph-devel-owner(a)vger.kernel.org>
>
>
> Hi Roman,
> I'll check it and feedback to you later. (Something wrong with
> servers, I have to fix it first)
>
> B.R.
> Changcheng
Hi, everyone.
It seems that spdk's event framework also provide a similar high
performance application framework, why the community choose seastar
over spdk event framework? Thanks.
What's the oldest Ceph release we plan to support on CentOS 8?
I ask because I was checking how well the CentOS 8 dev builds were going
(surprisingly well!) and saw luminous builds were failing because it
looks like the spec file needs updating to support installing python3
packages.
If we don't plan on doing luminous, we can save some time and system
resources by blacklisting CentOS 8 for luminous in the CI.
--
David Galloway
Systems Administrator, RDU
Ceph Engineering
IRC: dgalloway
On Thu, Oct 17, 2019 at 2:50 AM Paul Emmerich <paul.emmerich(a)croit.io> wrote:
>
> On Thu, Oct 17, 2019 at 12:17 AM Robert LeBlanc <robert(a)leblancnet.us> wrote:
> >
> > On Wed, Oct 16, 2019 at 2:50 PM Paul Emmerich <paul.emmerich(a)croit.io> wrote:
> > >
> > > On Wed, Oct 16, 2019 at 11:23 PM Robert LeBlanc <robert(a)leblancnet.us> wrote:
> > > >
> > > > On Tue, Oct 15, 2019 at 8:05 AM Robert LeBlanc <robert(a)leblancnet.us> wrote:
> > > > >
> > > > > On Mon, Oct 14, 2019 at 2:58 PM Paul Emmerich <paul.emmerich(a)croit.io> wrote:
> > > > > >
> > > > > > Could the 4 GB GET limit saturate the connection from rgw to Ceph?
> > > > > > Simple to test: just rate-limit the health check GET
> > > > >
> > > > > I don't think so, we have dual 25Gbp in a LAG, so Ceph to RGW has
> > > > > multiple paths, but we aren't balancing on port yet, so RGW to HAProxy
> > > > > is probably limited to one link.
> > > > >
> > > > > > Did you increase "objecter inflight ops" and "objecter inflight op bytes"?
> > > > > > You absolutely should adjust these settings for large RGW setups,
> > > > > > defaults of 1024 and 100 MB are way too low for many RGW setups, we
> > > > > > default to 8192 and 800MB
> > > >
> > > > On Nautilus the defaults already seem to be:
> > > > objecter_inflight_op_bytes 104857600
> > > > default
> > > = 100MiB
> > >
> > > > objecter_inflight_ops 24576
> > > > default
> > >
> > > not sure where you got this from, but the default is still 1024 even
> > > in master: https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/…
> >
> > Looks like it is overridden in
> > https://github.com/ceph/ceph/blob/4774808cb2923f65f6919fe8be5f98917075cdd7/…
>
> you are right, this is new in Nautilus. Last time I had to play around
> with these settings was indeed on a Mimic deployment.
>
> > I'm just not
> > understanding how your suggestions would help, the problem doesn't
> > seem to be on the RADOS side (which it appears your tweaks target),
> > but on the HTTP side as an HTTP health check takes a long time to come
> > back when a big transfer is going on.
>
> I was guessing a bottleneck on the RADOS side because you mentioned
> that you tried both civetweb and beast, somewhat unlikely to run into
> the exact same problem with both
Looping in ceph-dev in case they have some insights into the inner
workings that may be helpful.
From what I understand civitweb was not async and beast is, but if
beast is not coded exactly right, then it could behave similarly as
civitweb.
It seems that with beast incoming requests are being assigned to BEAST
threads and possibly it is doing as sync call to rados therefore
blocking requests behind it until the RADOS call is completed. I tried
looking through the code, but I'm not familiar with async in C++. I
could see two options that may resolve this. First, have a seperate
thread pool for accessing RADOS objects with a queue that BEAST
dispatches to and callback the completion at the end. The second
option is creating async RADOS calls so that it can yield the event
loop to another RADOS task. I couldn't tell if either one of these are
being done, but that should help small IO not get stuck behind large
IO.
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
How do I find the number of directory fragments and possibly their contents
?
How do I find the number of stray directories being created and possibly
their contents ?
Are there any commands that help get answers to the above questions?
--
Milind