Hi list,
let me quickly summarize the current state of cephadm development
and give some insight into the current list of topics for the near future
and for the Pacific release. For an introduction to cephadm, see [0].
With the initial Octopus release, cephadm was already able to deploy the
usual Ceph services. With the upcoming 15.2.2, cephadm will gain some UX
improvements, which make cephadm actually usable now.
In addition, cephadm can also now deploy NFS ganesha.
For the upcoming development, we have four themes where cephadm needs to
improve:
1. Streamline the user experience
This consists of topics like making the current state of cephadm visible
to the user, improving the responsiveness of the CLI, improving the day-1
experience, improving the official documentation, better integration
with the Ceph Dashboard, and improvements to particular services (RGW,
MDS, MGR, etc.)
2. Teuthology improvements
This theme contains things like making the dashboard use the cephadm.py
Teuthology task, enabling the RGW test again, making the CephFS test
runner compatible with cephadm, and allowing other Ceph components to
build on the cephadm.py task.
3. High availability.
We'd like to add the capability to cephadm to deploy highly available
services. This might involve new services like haproxy, keepalived or
maybe some internal DNS.
4. Resilience
This contains items like adding an MGR thrasher to Teuthology and making
all commands of cephadm idempotent.
In addition to those four themes, there are other topics as well:
* the current scheduler essentially randomizes the placement of daemons
across the cluster. There are certainly many improvements possible
here.
* It would be great to enhance cstart (similar to vstart, but with
containers) and thus improve the developer experience
* Enhance the compatibility with Rook (daemon placement, upgrades of
  Ceph and OSD management)
For more info, please have a look at the CDS a few weeks back: [1]
And if you'd like to get involved, feel free to join our weekly
orchestrator sync meeting on Monday or have a look at our tracker [2]
Best,
Sebastian
[0]: https://ceph.io/ceph-management/introducing-cephadm/
[1]: https://pad.ceph.com/p/orchestrator-pacific
[2]: https://tracker.ceph.com/projects/orchestrator/issues
--
SUSE Software Solutions Germany GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg). Geschäftsführer: Felix Imendörffer
Hi,
Are there any more unit-test resources for the CRUSH algorithm other than
the test cases here:
https://github.com/ceph/ceph/tree/master/src/test/crush
Or would more unit testing of CRUSH, apart from these test cases, be
overkill?
BR
Bobby !
Hi Folks,
The weekly perf meeting is on today and starting in about 20 minutes.
Topics today include Neha's sepia lab teuthology/cbt Ceph performance
experiments. Please feel free to add your own topics as well!
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Thanks,
Mark
Hi folks, it seems we've covered everything in the recent CDS Pacific,
so we'll cancel today's CDM.
Next time (June 3rd) we'll discuss cephadm and upgrades - how to handle
upgrade ordering, whether it could be enforced for non-containerized
ceph, etc. More topics can be added here:
https://tracker.ceph.com/projects/ceph/wiki/CDM_03-JUN-2020
Josh
Hi,
I reported the case in:
https://tracker.ceph.com/issues/45390
Problem is that with the serialized OSDMap the resulting buffers
are not equal. The code for that:
from test_compression.cc:

  bufferlist orig;
  ......
  o->decode(orig);
  bufferlist fbl;
  o->encode(fbl, o->get_encoding_features() | CEPH_FEATURE_RESERVED);
And when I JSON-dump the objects, they are equal.
So now the question is: which of the two tests is the essential one?
Is it okay for the bufferlists to be different, as long as the JSON data
is equal?
--WjW
Hi all,
Ceph documentation mentions it has two types of tests: *unit tests* (also
called make check tests) and *integration tests*. Strictly speaking, the *make
check tests* are not “unit tests”, but rather tests that can be run easily
on a single build machine after compiling Ceph from source.
unit tests: https://github.com/ceph/ceph/tree/master/src/test
In order to develop on Ceph, I am using a Ceph utility, *vstart.sh*, which
allows me to deploy a fake local cluster for development purposes. I am doing
unit testing, and these tests are helping me. Thanks!
My question: how realistic and large is the workload generated by the unit
tests? Are these tests enough for profiling function-call counts, loop
counts, and parallelism to a good extent?
Thanks in advance !
BR
Bobby !
Hello everyone,
I'm trying to implement tracing in radosgw using Jaeger. In tracing we must
have a Jaeger tracer variable which contains `spans`, which give us the time
taken by each function and are shown in the UI.
Now, to make the traces more readable to the user, I'm assigning a different
Jaeger tracer variable to each request the user makes, that is, one Jaeger
tracer variable for `swift -A http://localhost:8000/auth -U test:tester
-K testing list`,
and one for this:
`swift -A http://localhost:8000/auth -U test:tester -K testing list`
and each of those tracers contains only the spans related to that request,
making it more readable in the UI.
To define a jaeger_tracer variable we need a string, so to make the string
unique I'm thinking of building the string from the command-line request,
like `upload mycontainer file.txt`, and appending the current time to it.
But the problem is I'm not able to get the command-line request string. Is
there any way to access it in rgw_process.cc?
I have access to these variables:
RGWHandler_REST * const handler,
RGWOp *& op,
RGWRequest * const req,
req_state * const s,
Is there any way I can get the command-line request string from these
variables, or from some combination of them?
Sebastian,
I tried to run the backport-create-issue script:
$ python3 src/script/backport-create-issue 44826
WARNING:root:Missing issues will be created in Backport tracker of the
relevant Redmine project
INFO:root:Redmine key was read from '~/.redmine_key'; using it
INFO:root:Processing issue list ->44826<-
INFO:root:Processing 1 issues with status Pending Backport
INFO:root:https://tracker.ceph.com/issues/44826 skipped because the
project Orchestrator does not have a Backport tracker
INFO:root:Processed 1 issues
and got the above error. It seems batch backports are being done
manually instead [1]. I couldn't find an explanation for this policy
in the email archives. Why is the Orchestrator being treated
differently from the rest of the Ceph project?
[1] https://github.com/ceph/ceph/pull/34438
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hi Cephers,
I am working on Ceph librados. Currently I can run sequential/synchronous
read and write tests in both C and C++. However, I am struggling with
asynchronous/non-sequential test code. Are there any test repositories
which contain asynchronous/non-sequential example code?
Thanks in advance
Bobby !
Hello Everyone,
I'm trying to upload a file to a radosgw cluster. At the time of uploading,
no error is shown in the terminal, but when I try to retrieve that object,
an error pops up in my terminal:
`http://localhost:8000/swift/v1/{containername}/(unknown) error internal
server error:Unknow error`
I checked my log file (radosgw.8000.log) to see whether any error had
occurred during the upload operation, and I found this:
```
2020-05-03T05:08:45.617+0530 7f8e21de3700 2 req 66 1.623986488s
swift:put_obj completing
2020-05-03T05:08:45.617+0530 7f8e215e2700 2 req 66 1.623986488s
swift:put_obj op status=1900
2020-05-03T05:08:45.617+0530 7f8e215e2700 2 req 66 1.623986488s
swift:put_obj http status=201
2020-05-03T05:08:45.617+0530 7f8e215e2700 1 ====== req done
req=0x7f8fc444e7b0 op status=1900 http_status=201 latency=1.623986488s
======
2020-05-03T05:08:45.729+0530 7f8fa7fff700 1 -- 192.168.0.103:0/2076586408
<== osd.2 v2:192.168.0.103:6818/7818 1385 ==== osd_op_reply(3502
.dir.d7a4eb34-74e1-49fb-91ef-b8aa666f2576.4470.1.3 [call,call] v270'10 uv10
ondisk = 0) v8 ==== 236+0+0 (crc 0 0 0) 0x7f8f9403d020 con 0x5582360527e0
2020-05-03T05:08:45.765+0530 7f8e22de5700 20 failed to read header: end of
stream
2020-05-03T05:08:45.765+0530 7f8e24de9700 20 failed to read header: end of
stream
```
swift:put_obj is showing status 201, which means the upload was successful,
but `failed to read header: end of stream` is also there, which suggests
some error occurred. I cannot understand why I'm getting this.
ubuntu: 18.04
ceph: ceph version 15.1.0-1872-g1aa2724a36
(1aa2724a3612787b9e392de648b82a6cbd7ac222) octopus (rc)