I updated teuthology yesterday and since then have seen a lot of the
following errors:
...src/teuthology/virtualenv/local/lib/python2.7/site-packages/paramiko/ecdsakey.py:164:
CryptographyDeprecationWarning: Support for unsafe construction of
public numbers from encoded data will be removed in a future version.
Please use EllipticCurvePublicKey.from_encoded_point
self.ecdsa_curve.curve_class(), pointinfo
2019-07-31 01:45:18,976.976 ERROR:paramiko.transport:Exception: Error
reading SSH protocol banner
2019-07-31 01:45:18,976.976 ERROR:paramiko.transport:Traceback (most
recent call last):
2019-07-31 01:45:18,976.976 ERROR:paramiko.transport: File
"/home/bhubbard/src/teuthology/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py",
line 1966, in run
2019-07-31 01:45:18,976.976 ERROR:paramiko.transport: self._check_banner()
2019-07-31 01:45:18,977.977 ERROR:paramiko.transport: File
"/home/bhubbard/src/teuthology/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py",
line 2143, in _check_banner
2019-07-31 01:45:18,977.977 ERROR:paramiko.transport: "Error
reading SSH protocol banner" + str(e)
2019-07-31 01:45:18,977.977 ERROR:paramiko.transport:SSHException:
Error reading SSH protocol banner
Sometimes these are fatal and sometimes not. Wondering if anyone else
has seen them?
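In case it is useful to anyone else hitting this, below is a minimal
sketch of the first workaround I would try: give paramiko more time to
read the banner by passing banner_timeout when connecting (the host,
user, and timeout values are placeholders, not real lab settings):

    # Sketch only: open the connection with a larger banner timeout.
    # paramiko's SSHClient.connect() accepts both timeout and
    # banner_timeout; the values below are arbitrary placeholders.
    import paramiko

    def connect_with_longer_banner_timeout(host, user):
        client = paramiko.SSHClient()
        client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        client.connect(host, username=user,
                       timeout=60,          # TCP connect timeout
                       banner_timeout=60)   # wait longer for the SSH banner
        return client

This doesn't explain why the banner read started failing after the
update, but it would at least tell us whether the remote end is just
slow to respond.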
--
Cheers,
Brad
David G,
I've been looking over the logs and "ceph pg dump pgs" on the LRC, and
things look good to me. If you see anything not working, file a
tracker, or contact me if you have any questions.
There is one thing that you should be aware of. There are still
filestore objectstores for some of the OSDs. The auto_repair feature is
not supported for filestore, so when they deep-scrub they won't repair.
With auto_repair enabled in this mixed cluster the LRC will auto_repair
if the primary OSD for a PG is bluestore even if some replicas are
filestore. So I would convert the remaining filestore OSDs to
bluestore. If you are paranoid, you should disable auto_repair until the
conversion is completed.
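If it helps with the conversion, here is a rough sketch of how to list
which OSDs are still on filestore (this assumes "ceph osd metadata"
with no id returns a JSON array whose entries carry "id" and
"osd_objectstore"; adjust if your output differs):

    # Rough sketch: print the ids of OSDs still backed by filestore.
    import json
    import subprocess

    def filestore_osd_ids():
        out = subprocess.check_output(
            ['ceph', 'osd', 'metadata', '--format=json'])
        return sorted(m['id'] for m in json.loads(out)
                      if m.get('osd_objectstore') == 'filestore')

    if __name__ == '__main__':
        print(filestore_osd_ids())

Turning osd_scrub_auto_repair back off in the meantime, and re-enabling
it once the last filestore OSD is gone, would be the cautious option.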
David Z
On 7/2/19 3:11 PM, David Zafman wrote:
>
> I don't see that now in ceph status. A pg's deep scrub would have to
> be over 5 days overdue for that warning to occur.
>
> David
>
> On 7/2/19 2:29 PM, David Galloway wrote:
>> This build is installed now.
>>
>> It looks like "1 pgs not scrubbed in time" is back.
>>
>> On 6/28/19 12:27 PM, David Zafman wrote:
>>> David,
>>>
>>> I have new scrub handling code built for Nautilus. Could we
>>> install this on the LRC to see how well it works in a more realistic
>>> environment?
>>>
>>> https://shaman.ceph.com/builds/ceph/wip-zafman-testing-nautilus/31ff31f2c8d…
>>>
>>>
>>>
>>> Thanks
>>>
>>> David Zafman
>>>
Hi everyone,
I'd really like to push this through and get something working this week.
After poking around it seems clear that there aren't any other registries
we should be using for our temp/test builds (unless we just spam
dockerhub, but that seems unwise). So, let's just get over ourselves and
run our own registry.
1- Where to put it? I'm assuming this should go in the same place that
chacra is putting our other temp builds. This is in RDU, right? What
machine(s) should we use? If we use the same retention policy as the
debs/rpms then this will be an incremental increase in the storage needed.
2- What registry software to use? We don't need any fancy features
whatsoever--just the ability to push, pull, and delete images. So,
whatever is easiest to set up.
3- Jenkins integration. I think we need to have a child job linked to the
centos build to do the ceph-container build and then push. Similarly,
whatever it is that removes the old packages from the repos needs to also
delete the image (a rough sketch of the registry calls this needs is below).
4- Chacra/shaman integration? Should the container build show up in
shaman/chacra as well? Is there extra work needed to do that?
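To make item 3 a bit more concrete, here is a rough sketch of the two
registry-side calls a retention job would need, using the standard
Docker Registry HTTP API v2 (the registry URL is a placeholder, and
deletes only work if the registry has deletion enabled):

    # Sketch of the v2 registry API calls a cleanup job would need.
    import requests

    REGISTRY = 'http://registry.example.local:5000'  # placeholder address
    MANIFEST_V2 = 'application/vnd.docker.distribution.manifest.v2+json'

    def list_repositories():
        r = requests.get(REGISTRY + '/v2/_catalog')
        return r.json().get('repositories', [])

    def delete_image(repo, tag):
        # Deletion goes by digest, so fetch the manifest digest first.
        r = requests.get('%s/v2/%s/manifests/%s' % (REGISTRY, repo, tag),
                         headers={'Accept': MANIFEST_V2})
        digest = r.headers['Docker-Content-Digest']
        requests.delete('%s/v2/%s/manifests/%s' % (REGISTRY, repo, digest))

Whatever software we pick, as long as it speaks this API the Jenkins
cleanup side stays simple.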
Thanks!
sage
Hi Alfredo,
On 30.04.19 at 15:00, Alfredo Deza wrote:
> On Tue, Apr 30, 2019 at 8:52 AM Sebastian Wagner
> <sebastian.wagner(a)suse.com> wrote:
>>
>> All,
>>
>> I've been working on exercising some Rook orchestrator commands in an
>> automated fashion (like deploying Ceph services). The concept itself
>> works pretty well and now I'd like to integrate this into Sepia.
>>
>> A part of this endeavor was to set up an empty Kubernetes cluster using
>> local VMs and Terraform. As Sepia already runs a k8s cluster, it might
>> make sense to just use this existing cluster, instead of creating a new
>> cluster for every test run. One downside of re-using existing clusters
>> is: only one test run can access a given cluster at a time, which
>> eliminates some possible parallelism.
>>
>> There is another bummer: As far as I know, we're not building Ceph
>> container images for Ceph PRs and https://hub.docker.com/r/ceph/ceph
>> only contains stable Nautilus images. Testing Ceph images automatically
>> after they're released to the public isn't going to fly.
>>
>> Are there any plans to build Ceph container images in Shaman or from
>> within Jenkins Jobs?
>
> This has been discussed in the past, but it is a tremendous effort
> which has many moving pieces.
Indeed, it is.
On the other hand, if we want to make containers first-class citizens, is
not building them really a viable option?
> One of them is where to store the
> container images - I don't think it is OK to push
> to hub.docker.com since we build about 400 repositories per day.
Actually this would be a perfect use case for a private registry.
>
>>
>> Or asked in a different way: Are there any automatically built Octopus
>> container images?
>
> There isn't anything for any release at the moment.
Thanks for the clarification.
>>
>> Best,
>> Sebastian
>>
>>
>>
>>
>>
>>
>> --
>> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
>> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
>
--
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
On 30.04.19 at 15:34, Alfredo Deza wrote:
> On Tue, Apr 30, 2019 at 9:27 AM Sebastian Wagner
> <sebastian.wagner(a)suse.com> wrote:
>>
>> Hi Alfredo,
>>
>> On 30.04.19 at 15:00, Alfredo Deza wrote:
>>> On Tue, Apr 30, 2019 at 8:52 AM Sebastian Wagner
>>> <sebastian.wagner(a)suse.com> wrote:
>>>>
>>>> All,
>>>>
>>>> I've been working on exercising some Rook orchestrator commands in an
>>>> automated fashion (like deploying Ceph services). The concept itself
>>>> works pretty well and now I'd like to integrate this into Sepia.
>>>>
>>>> A part of this endeavor was to set up an empty Kubernetes cluster using
>>>> local VMs and Terraform. As Sepia already runs a k8s cluster, it might
>>>> make sense to just use this existing cluster, instead of creating a new
>>>> cluster for every test run. One downside of re-using existing clusters
>>>> is: only one test run can access a given cluster at a time, which
>>>> eliminates some possible parallelism.
>>>>
>>>> There is another bummer: As far as I know, we're not building Ceph
>>>> container images for Ceph PRs and https://hub.docker.com/r/ceph/ceph
>>>> only contains stable Nautilus images. Testing Ceph images automatically
>>>> after they're released to the public isn't going to fly.
>>>>
>>>> Are there any plans to build Ceph container images in Shaman or from
>>>> within Jenkins Jobs?
>>>
>>> This has been discussed in the past, but it is a tremendous effort
>>> which has many moving pieces.
>>
>> Indeed, it is.
>>
>> On the other hand, if we want to make containers first-class citizens, is
>> not building them really a viable option?
>
> I agree with you here, we should be building containers, regardless of
> how many repositories we produce a day
>
>>
>>> One of them is where to store the
>>> container images - I don't think it is OK to push
>>> to hub.docker.com since we build about 400 repositories per day.
>>
>> Actually this would be a perfect use case for a private registry.
>
> I agree again here. Would love to see if there was a
> community-based effort for a registry so we could push images. As it
> stands right now, our very small team can't possibly
> take on running/maintaining another service, much less provide for the
> tremendous amount of infrastructure needed.
Off the top of my head, I can think of two alternatives that don't
require any new services:
1. Maybe Docker Hub could build a nightly image from the latest master
(https://shaman.ceph.com/api/repos/ceph/master/latest/)
with a static Dockerfile, using the setup described at
https://docs.docker.com/docker-hub/builds/
This wouldn't give us tests for PRs, though.
2. Or alternatively, Jenkins could build images locally without pushing
them anywhere. (I'm not a big fan of this, as it would require a
temporary private container registry while executing the test.)
>
>
>>
>>>
>>>>
>>>> Or asked in a different way: Are there any automatically built Octopus
>>>> container images?
>>>
>>> There isn't anything for any release at the moment.
>>
>> Thanks for the clarification.
>>
>>>>
>>>> Best,
>>>> Sebastian
--
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
All,
I've been working on exercising some Rook orchestrator commands in an
automated fashion (like deploying Ceph services). The concept itself
works pretty well and now I'd like to integrate this into Sepia.
A part of this endeavor was to set up an empty Kubernetes cluster using
local VMs and Terraform. As Sepia already runs a k8s cluster, it might
make sense to just use this existing cluster, instead of creating a new
cluster for every test run. One downside of re-using existing clusters
is: only one test run can access a given cluster at a time, which
eliminates some possible parallelism.
There is another bummer: As far as I know, we're not building Ceph
container images for Ceph PRs and https://hub.docker.com/r/ceph/ceph
only contains stable Nautilus images. Testing Ceph images automatically
after they're released to the public isn't going to fly.
Are there any plans to build Ceph container images in Shaman or from
within Jenkins Jobs?
Or asked in a different way: Are there any automatically built Octopus
container images?
Best,
Sebastian
--
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
Jos just messaged me about this issue on IRC. Seems maybe it's not user
error?
Kefu, do you know what's going on here? I only ask you specifically
because of this PR: https://github.com/ceph/ceph/pull/23411
-------- Forwarded Message --------
Subject: Re: Teuthology failures
Date: Tue, 9 Apr 2019 23:50:12 +0530
From: Sidharth Anupkrishnan <sanupkri(a)redhat.com>
To: Patrick Donnelly <pdonnell(a)redhat.com>
CC: David Galloway <dgallowa(a)redhat.com>, Kefu Chai <kchai(a)redhat.com>
Sorry, I attached the wrong logs. The latest logs (against master)
are: http://pulpito.ceph.com/sidharthanup-2019-04-09_16:33:46-multimds-master-di…
On Tue, Apr 9, 2019 at 10:44 PM Sidharth Anupkrishnan
<sanupkri(a)redhat.com> wrote:
I tried testing against master also; it still fails. Here are the
logs: http://pulpito.ceph.com/sidharthanup-2019-04-09_12:36:20-multimds-nautilus-…
The teuthology-suite command used was:
teuthology-suite --machine-type smithi --distro rhel -D 7.6 --flavor
default --email sanupkri(a)redhat.com -p
9 --suite multimds --ceph master -n 5 --ceph-repo
https://github.com/ceph/ceph.git --suite-branch
wip-dir-pin-attribute-fail --suite-repo
https://github.com/sidharthanup/ceph.git --filter test_exports --limit 2
The same QA runs against nautilus were working yesterday. I only
started seeing this error today.
On Tue, Apr 9, 2019 at 9:46 PM Patrick Donnelly <pdonnell(a)redhat.com> wrote:
A cursory look suggests the problem is just that Sidharth is testing a
master QA suite against Nautilus.
@Sidharth Anupkrishnan instead do:
teuthology-suite ... --ceph-repo https://github.com/ceph/ceph.git --ceph master -n 5
-n 5 allows teuthology-suite to look back 5 merges to find the most
recent version of master that has finished building. You can also use
-S <sha1> to pick a recent merge commit that has completed building.
On Tue, Apr 9, 2019 at 6:55 AM David Galloway
<dgallowa(a)redhat.com> wrote:
>
> Hey Sidharth,
>
> I looked at the repo that was built for the version of Ceph you're
> testing there and it looks like the package name is actually
> python36-cephfs.
>
> See
> https://2.chacra.ceph.com/r/ceph/nautilus/c09e90d1847fc4ffdd7384c9adf7f60c1…
>
> So I went and looked at the code in teuthology and I *think* the
> package lists it uses to install packages is provided in the ceph.git
> repo somewhere.
>
> I did find this which adds python3-cephfs:
> https://github.com/ceph/ceph/pull/23411
>
> But you'll see here, teuthology renames it to 'python34-cephfs':
> https://github.com/ceph/teuthology/blob/master/teuthology/task/install/rpm.…
>
> SO... I'm not sure where exactly to fix this. If we're not supposed
> to be building a python36-cephfs package, I guess that'd get fixed in
> the spec file?
>
> If we are supposed to be building a package called python36-cephfs,
> then teuthology needs to be patched.
>
> I've CC'ed Kefu and Patrick so they can take a look and suggest a
> resolution.
>
> On 4/9/19 8:28 AM, Sidharth Anupkrishnan wrote:
> > Hey David!
> >
> > I've run into some errors while testing:
> > http://pulpito.ceph.com/sidharthanup-2019-04-09_11:17:44-multimds-nautilus-…
> > Seems the smithi machines cannot install python34-cephfs. Any idea why?
> >
> > Regards,
> > Sidharth Anupkrishnan
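To make the rename being discussed a bit more concrete, the logic is
roughly of this shape (a hypothetical illustration only; the real
mapping lives in teuthology/task/install/rpm.py and may well differ):

    # Hypothetical illustration, not the actual teuthology code: a
    # hard-coded 'python34' suffix breaks once the repo ships
    # python36-cephfs instead.
    PY3_SUFFIX = '34'   # assumed fixed suffix, the source of the mismatch

    def rpm_package_name(pkg):
        # e.g. 'python3-cephfs' -> 'python34-cephfs'
        if pkg.startswith('python3-'):
            return 'python' + PY3_SUFFIX + '-' + pkg[len('python3-'):]
        return pkg

Either the suffix needs to follow the distro's actual python3 version,
or the spec file needs to keep producing the name teuthology expects.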
-- Patrick Donnelly
pdonnell@senta02 ~/ceph$ git fetch --all
Fetching origin
Fetching upstream
From https://github.com/ceph/ceph
x [deleted] (none) -> upstream/pull/26516/merge
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: The remote end hung up unexpectedly
error: Could not fetch upstream
It's been oddly failing like this for the last two days. Could this be
a firewall/IDS issue?
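In the meantime, one thing worth trying is to retry and fetch only the
refs that are actually needed rather than --all (which also pulls the
pull/*/merge refs, as in the output above, so the transfer that keeps
getting cut off is much larger than it needs to be). A minimal sketch,
with arbitrary remote/branch names and retry counts:

    # Sketch: retry a narrower fetch instead of 'git fetch --all'.
    import subprocess
    import time

    def fetch_with_retries(remote='upstream', branch='master', attempts=3):
        for attempt in range(1, attempts + 1):
            if subprocess.call(['git', 'fetch', remote, branch]) == 0:
                return True
            time.sleep(5 * attempt)  # back off before retrying
        return False

    if __name__ == '__main__':
        fetch_with_retries()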
--
Patrick Donnelly