Hi,
Are there any plans to implement a per-client throttle on mds client requests?
We just had an interesting case where a new cephfs user was hammering
an mds from several hosts. In the end we found that their code was
doing:
while d := getafewbytesofdata():
    f = open("file.dat", "a")
    f.write(d)
    f.close()
By changing their code to:
f = open("file.dat", "a")
while d := getafewbytesofdata():
    f.write(d)
f.close()
it completely removes their load on the MDS, since every open/close cycle
is a metadata round trip to the MDS, while writes to an already-open file
go straight to the OSDs.
In a multi-user environment it's hard to scrutinize every user's
application, so we'd prefer to just throttle down the client req rates
(and let them suffer from the poor performance).
Thoughts?
Thanks,
Dan
Hi everyone,
The target release date for Octopus is March 1, 2020.
The freeze will be January 1, 2020. As a practical matter, that means any
features need to be in before people leave for the holidays, so that they
land in time and so that we can run tests over the holidays while the test
lab is relatively idle.
We plan to stick to a 12-month cadence going forward, so the P release
target would be March 1, 2021 (regardless of whether Octopus is early or
late).
Thanks!
sage
We created dev@ceph.io several weeks back. There has been plenty of
time now for everyone to get subscribed, so please now direct all dev
discussion for Ceph proper to dev@ceph.io and use this list for
Ceph kernel client development only. Avoid copying both lists unless the
discussion is relevant both for userspace and the kernel.
https://lists.ceph.io/postorius/lists/dev.ceph.io/
Thanks!
sage
Hi everyone,
I am currently working on a project where the Rados Gateway SSE-KMS
feature is required;
I cannot rely on the Barbican-based solution, and Vault is the KMS of choice.
For these reasons, here [1] is a proposal to abstract the key management
service and an initial sketch of a refactoring strategy to support
HashiCorp Vault.
I am currently not planning on adding any new SSE strategy (such as
SSE-S3), although the refactoring might simplify its implementation.
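To make the idea a bit more concrete, here is a rough sketch of the kind of
abstraction I have in mind; the names (SecretEngine, get_key,
VaultSecretEngine) are placeholders for illustration only and not the
interface actually proposed in [1]:

// Illustrative sketch only: class and method names are placeholders,
// not the interface proposed in [1].
#include <cerrno>
#include <string>
#include <utility>

// a backend-agnostic interface for resolving the actual encryption key
// referenced by an SSE-KMS request
class SecretEngine {
public:
  virtual ~SecretEngine() = default;
  // returns 0 and fills actual_key on success, or a negative errno
  virtual int get_key(const std::string& key_id, std::string& actual_key) = 0;
};

// one implementation per KMS backend; the rest of the gateway only
// ever talks to SecretEngine
class VaultSecretEngine final : public SecretEngine {
public:
  VaultSecretEngine(std::string addr, std::string token)
    : addr(std::move(addr)), token(std::move(token)) {}
  int get_key(const std::string& key_id, std::string& actual_key) override {
    // a real implementation would query the Vault secrets engine over HTTP
    // here; this sketch only reports "not supported"
    (void)key_id;
    (void)actual_key;
    return -EOPNOTSUPP;
  }
private:
  std::string addr;
  std::string token;
};

A Barbican implementation would then just be another subclass of the same
interface, and choosing the backend becomes a configuration decision rather
than a separate code path.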
Thanks.
[1] https://pad.ceph.com/p/rgw_sse-kms
--
Andrea Baglioni
Hello everyone,
Recently I was trying to expand an OSD disk using bluefs-bdev-expand.
Since this OSD runs on a virtual machine managed by oVirt, I first resized
the virtual disk of the VM. The result from 'lsblk' was:
vdb    252:16   0  200G  0 disk
└─ceph--bc94ec07--2ac3--4965--8750--bb9e42ec670f-osd--block--aa7de90e--0442--4cd9--9927--a17dd666ea74    253:2   0  100G  0 lvm
As you can see, the block device /dev/vdb is now 200G but the logical volume
is still 100G. I then ran the following:
lvextend -L+100G
/dev/ceph--bc94ec07--2ac3--4965--8750--bb9e42ec670f/osd--block--aa7de90e--0442--4cd9--9927--a17dd666ea74
After using lvextend I then ran:
# ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-7/
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-7//block": {
        "osd_uuid": "aa7de90e-0442-4cd9-9927-a17dd666ea74",
        "size": 107372085248,
        "btime": "2019-07-02 13:56:58.589154",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "6effd8df-d109-4ef3-9cfa-c68f9756a54b",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "osd_key": "AQCJRhtdZZgTEBAA7G7fzTyj0d2r4RRa/uxaZQ==",
        "ready": "ready",
        "whoami": "7"
    }
}
Running ceph-bluestore-tool bluefs-bdev-expand --path
/var/lib/ceph/osd/ceph-7 then fails with an error that I unfortunately
cannot reproduce right now; the bottom line is that the device does not get
expanded.
Is bluefs-bdev-expand supported on Mimic? Is there a clean way to
expand an OSD? Right now I'm running the following with ceph-deploy:
# ceph-deploy disk zap vm1-osd1 /dev/vdb
# ceph-deploy osd create vm1-osd1 --data /dev/vdb
The above deletes everything and recreates it which is really not ideal.
Any suggestion?
Thanks in advance.
--
Met vriendelijke groeten,
Valentin Bajrami
Target Holding
hi guys,
i just came across boost::outcome[0]. it reminded me of the discussion we
had back in Barcelona regarding error handling in crimson.
well, strictly speaking, it's not limited to errors; it covers non-error
handling as well.
the question is: shall we start prototyping the crimson variant of
outcome<> now? and if yes, can we leverage boost::outcome<> for it?
a little bit background:
seastar uses exceptions for propagating errors, but this incurs runtime
overhead: to throw an exception, the libstdc++ runtime needs to acquire a
global lock.
well, some of us might want to argue: why not just return a
future<Result, Error>? let me use an example here. imagine we are
handling a write request in the OSD; we might need to go through the
following steps:
1. perform some sanity tests, for instance, to see if the OSD is ready
for handling the write request
2. try to read the object info of the object from local storage to see
if it already exists
3. write the object to the local storage, and send write requests
to the replica OSDs (assuming it's in a replicated pool), then wait for the
completions of these write ops.
4. update the statistics
5. reply to the client
and it's intuitive to structure these steps using chained continuations, like:
do_with(std::move(request), [this](auto& request) {
  return perform_tests(request->object_id).then([&request, this] {
    return read_object_info(request->object_id);
  }).then([&request, this](optional<object_info> object_info) {
    return when_all(
      write_local(request->object_id, request->offset, request->data),
      parallel_for_each(replica_osds, [&request](auto replica_osd) {
        return replica_osd->write_remote(request->object_id,
                                         request->offset, request->data);
      }));
  }).then([&request, this] {
    update_statistics(request->data.size());
    return reply_to(reply_t::success, request);
  }).handle_exception([&request](auto exception) {
    return reply_to(reply_t::failure, exception, request);
  });
});
in which, if any test fails in step #1, we either need to wait until
the OSD is ready, or we need to bail out and skip the following
steps. the "handle_exception()" clause is used to handle the "bail
out" case, where we cannot do anything to serve the request, for
instance when the request is invalid.
we want to differentiate between two types of errors. one type is actual
exceptional conditions, which do not happen often in the real world and
which we don't need/want to optimize for. but the other type can be
perfectly normal: for instance, it's fairly normal that an object does not
exist yet when we are trying to write to it. we do want to be performant
when handling the "errors" in this category, and we also want to handle
them in a way that is as convenient as handling exceptions. that means we
need an efficient way to tell the caller "please skip the following
continuations, i will take this handling route instead". if my memory
serves me correctly, we thought we would need to create a wrapper around
seastar::future<> to allow the caller to do something like
// a helper to run func or skip it
template<typename Func>
auto ignore_on_error(Func&& f) {
  return [f=std::forward<Func>(f)](auto&& t) {
    return t.is_value() ? f(t.value()) : t;
  };
}

return read_object_info(oid).then(
  ignore_on_error([](object_info& oi) {
    return handle_write_with_object_info(std::move(oi));
  })
).then([](auto t) {
  return handle_write_without_object_info();
});
in the example above, i assume we will do something very different
depending on whether the object exists.
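to make the boost::outcome<> idea a bit more concrete, here is a minimal,
synchronous sketch. it is not crimson code: read_object_info_sync(),
object_info and oid_t are made-up stand-ins. it just shows how
outcome::result<> lets the "object does not exist" case travel by value
instead of being thrown:

#include <boost/outcome.hpp>
#include <system_error>

namespace outcome = boost::outcome_v2;

struct object_info { /* ... */ };
using oid_t = unsigned;

outcome::result<object_info, std::error_code>
read_object_info_sync(oid_t oid)
{
  const bool exists = (oid % 2 == 0);    // stand-in for a local lookup
  if (!exists) {
    // the "normal" error: nothing is thrown, the error code is returned by value
    return std::make_error_code(std::errc::no_such_file_or_directory);
  }
  return object_info{};
}

int handle_write(oid_t oid)
{
  auto r = read_object_info_sync(oid);
  if (r) {
    // handle_write_with_object_info(r.value());
    return 0;
  }
  // handle_write_without_object_info();
  return r.error().value();
}

the wrapper around seastar::future<> we discussed would essentially be the
asynchronous analogue of the if/else above, with the error carried inside
the future instead of thrown across the continuation chain.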
cheers,
---
[0] https://www.boost.org/doc/libs/1_70_0/libs/outcome/doc/html/index.html
--
Regards
Kefu Chai
The next Ceph Developer Monthly falls on this Wednesday, July 3. Since
this is adjacent to a US holiday, it's likely many people won't make it.
More importantly, we failed to send out an agenda last week.
Let's delay this until next week, Jul 10 9PM ET (Jul 11 0100 UTC).
Thanks!
sage
hi Mark,
i am working on using cbt for testing crimson. as you might know,
crimson-osd is currently using a variant of memstore as its object
store backend, so it'd be very easy for crimson-osd to run out of
memory, as the default run "time" of cbt radosbench is 300 seconds.
currently, each radosbench run is composed of 3 steps:
1. prefill // optional, enabled if "prefill_time" or "prefill_objects" is set
2. write
3. read // optional, enabled if "write_only" is not set
the pain point is that the run times for the write and read steps are
specified using the same setting -- "time".
so, i am wondering if it's okay to add an option named "read_only" to
skip the "write" step and let the prefill step prepare the testbed for
the read test, so that we can specify the time for prefilling and the time
for reading separately.
as an alternative, we could have a "write_time" option which defaults
to "time" if not specified but takes precedence over "time" when it is;
and if it's "0", the "write" step will be skipped.
what do you think?
--
Regards
Kefu Chai