Here we go again! As usual the conference theme is intended to
inspire, not to restrict; talks on any topic in the world of free and
open source software, hardware, etc. are most welcome, and Ceph talks
definitely fit.
I've added this to https://pad.ceph.com/p/cfp-coordination as well.
-------- Forwarded Message --------
Subject: [lca-announce] linux.conf.au 2020 - Call for Sessions and
Miniconfs now open!
Date: Tue, 25 Jun 2019 21:19:43 +1000
From: linux.conf.au Announcements <lca-announce(a)lists.linux.org.au>
Reply-To: lca-announce(a)lists.linux.org.au
To: lca-announce(a)lists.linux.org.au
The linux.conf.au 2020 organising team is excited to announce that the
linux.conf.au 2020 Call for Sessions and Call for Miniconfs are now open!
These will stay open from now until Sunday 28 July Anywhere on Earth
(AoE) (https://en.wikipedia.org/wiki/Anywhere_on_Earth).
Our theme for linux.conf.au 2020 is "Who's Watching", focusing on
security, privacy and ethics.
As big data and IoT-connected devices become more pervasive, it's no
surprise that we're more concerned about privacy and security than ever
before.
We've set our sights on how open source could play a role in maximising
security and protecting our privacy in times of uncertainty.
With the concept of privacy continuing to blur, open source could be the
solution to give us '2020 vision'.
Call for Sessions
Would you like to talk in the main conference of linux.conf.au 2020?
The main conference runs from Wednesday to Friday, with multiple streams
catering for a wide range of interest areas.
We welcome you to submit a session
(https://linux.conf.au/programme/sessions/) proposal for either a talk
or tutorial now.
Call for Miniconfs
Miniconfs are dedicated day-long streams focusing on single topics,
creating a more immersive experience for delegates than a session.
Miniconfs are run on the first two days of the conference before the
main conference commences on Wednesday.
If you would like to organise a miniconf
(https://linux.conf.au/programme/miniconfs/) at linux.conf.au, we want
to hear from you.
Have we got you interested?
You can find out how to submit your session or miniconf proposals at
https://linux.conf.au/programme/proposals/.
If you have any other questions you can contact us via email at
contact(a)lca2020.linux.org.au.
We are looking forward to reading your submissions.
linux.conf.au 2020 Organising Team
---
Read this online at
https://lca2020.linux.org.au/news/call-for-sessions-miniconfs-now-open/
_______________________________________________
lca-announce mailing list
lca-announce(a)lists.linux.org.au
http://lists.linux.org.au/mailman/listinfo/lca-announce
Hi all,
I've had a chat with Sage & Dan Mick about the current state of
telemetry, and I'd like to propose a few ideas to hopefully improve it
and make the data collected more relevant.
The current data is quite limited. I was able to take a look at, say,
how many pools out there (well, of the ~300ish clusters that ever
reported) have a non-2^n pg_num, but seeing whether this affects
performance or data distribution was not possible.
My goal is to have telemetry data that allows us to make more informed
decisions about what matters to the user base; the comments below are
not necessarily ordered by relevance, since they grew out of a thread on
looking at the current data reported.
Curious about your thoughts - too detailed information? Anything you'd
like to see included? What'd help you in your area?
- The crash section does expose actual hostnames ("entity_name"). If we
want to preserve the ability to tell whether it's the same entity
crashing or a different one, I'd propose that, similar to report_id, we
generate a report_secret_salt in the plugin that we never share with
the server; we can then use it to hash any potentially identifying
strings consistently.
(This will change with Sage's pending PR to point this at a different
channel.)
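A minimal sketch of that salting idea in Python (the function names and the report_secret_salt handling here are illustrative, not the actual plugin code):

```python
import hashlib
import secrets

def generate_salt() -> str:
    # Generated once per cluster and stored only locally,
    # never sent to the telemetry server.
    return secrets.token_hex(32)

def anonymize(value: str, salt: str) -> str:
    # Same input + same salt -> same digest, so the server can still
    # tell "the same entity crashed again" apart from "a different
    # entity crashed", without ever learning the hostname itself.
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()

salt = generate_salt()
a = anonymize("mon.host-1", salt)
b = anonymize("mon.host-1", salt)
c = anonymize("mon.host-2", salt)
assert a == b and a != c
```

Since the salt never leaves the cluster, the server cannot run a dictionary attack against common hostnames across clusters, yet repeated crashes from one entity still correlate.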
- The pool reporting should include:
- EC policy (plugin, parameters)
- I can tell whether a pool is EC, and the sum k+m, but not k or m
individually ...
- Pool application association (and it'd be lovely if we could tell
data/metadata pools apart for CephFS/RBD)
- Possibly per-pool usage?
- The report should include the enabled plugins
- Plugins should have a standard API call to report their own telemetry
- e.g., balancer/pg_autoscaler settings come to mind
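As a rough sketch of what such a standard call could look like (the get_telemetry() hook, the class, and the collection helper are hypothetical, not an existing mgr interface):

```python
# Hypothetical convention: each mgr module implements get_telemetry(),
# returning a JSON-serializable dict; the telemetry module collects
# the results under each module's name.

class BalancerModule:
    """Stand-in for a mgr module exposing its own telemetry."""
    def __init__(self):
        self.active = True
        self.mode = "upmap"

    def get_telemetry(self) -> dict:
        return {"active": self.active, "mode": self.mode}

def collect_plugin_telemetry(modules: dict) -> dict:
    # Only modules that implement the hook contribute to the report.
    report = {}
    for name, module in modules.items():
        hook = getattr(module, "get_telemetry", None)
        if callable(hook):
            report[name] = hook()
    return report

report = collect_plugin_telemetry({"balancer": BalancerModule()})
assert report == {"balancer": {"active": True, "mode": "upmap"}}
```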
- The way the ceph version/os/distro/kernel/description/cpu/arch fields
are currently aggregated individually makes them very difficult to
analyze. In case you're not familiar, it looks something like this
(trimmed):
"kernel_version": {
"4%15%0-54-generic": 6,
"4%15%0-50-generic": 20,
"4%18%0-25-generic": 3
},
"ceph_version": {
"ceph version 14%2%1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)": 29
},
"kernel_description": {
"#58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019": 6,
"#54-Ubuntu SMP Mon May 6 18:46:08 UTC 2019": 20,
"#26~18%04%1-Ubuntu SMP Thu Jun 27 07:28:31 UTC 2019": 3
},
"cpu": {
"Intel(R) Xeon(R) CPU E5-1650 v3 @ 3%50GHz": 20,
"Intel(R) Core(TM) i7-7700 CPU @ 3%60GHz": 9
}
}
I'd rather see it aggregated at the tuple level:
environment: [
{
kernel_version: "4%15%0-54-generic",
arch: "x86_64",
distro: "ubuntu",
cpu: "Intel(R) Core(TM) i7-7700 CPU @ 3%60GHz",
kernel_description: "...",
ceph_version: "...",
count: 6
},
...
]
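The tuple-level aggregation itself is cheap to do server-side; a sketch in Python (field names taken from the example above, sample data invented):

```python
from collections import Counter

FIELDS = ("kernel_version", "arch", "distro", "cpu",
          "kernel_description", "ceph_version")

def aggregate_environments(hosts: list) -> list:
    # Count hosts whose metadata tuples are identical, instead of
    # counting each field independently as today.
    counts = Counter(tuple(h.get(f) for f in FIELDS) for h in hosts)
    return [dict(zip(FIELDS, combo), count=n)
            for combo, n in counts.most_common()]

hosts = [
    {"kernel_version": "4.15.0-54-generic", "arch": "x86_64",
     "distro": "ubuntu", "cpu": "Intel(R) Core(TM) i7-7700",
     "kernel_description": "#58-Ubuntu", "ceph_version": "14.2.1"},
] * 6
env = aggregate_environments(hosts)
assert env[0]["count"] == 6
```

With this shape you can still recover today's per-field counts by summing over the tuples, but not vice versa, which is exactly the problem with the current format.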
- The OSD section could be revamped to expose more details; it's
currently overly simplified. Is BlueStore used with rotational media?
NVMe? SSD? Is FileStore? On which media, and, possibly, how big are
the WAL/RocksDB/data partitions?
Is encryption used? Were the OSDs deployed via ceph-volume, ceph-disk,
...? Which file system is used with FileStore? Is uring enabled? Etc.
(In short, ceph osd metadata should probably grow to encompass this,
and telemetry would scrape a subset of it.)
- While we're on hardware, I'd like to know whether there's a separate
cluster/public network, and whether we can deduce the hardware
associated with it (10 GbE? 25 GbE? VLAN? bond? etc.)
- Are there any msgr features we'd want to know about? v2? Encryption?
- Anything on the MDS?
- RFC: include "ceph features"?
- There's no actual performance data (commit latency or anything else).
Could we grab a histogram, or at least min/max/avg/stddev/sum(?), of
some high-level metrics since the last report from the Prometheus
instance that most recent environments would likely have?
(I'd like to see if we can deduce that a certain update made the
clusters in the field slower or faster.)
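Assuming we can pull raw samples for a metric from such a Prometheus instance (or from the daemons directly), the summary itself is trivial to compute before reporting; a sketch:

```python
import math

def summarize(samples: list) -> dict:
    # Reduce raw samples of a high-level metric (e.g. commit latency
    # in ms) to a compact summary suitable for a telemetry report.
    n = len(samples)
    total = sum(samples)
    mean = total / n
    var = sum((s - mean) ** 2 for s in samples) / n  # population variance
    return {"min": min(samples), "max": max(samples), "avg": mean,
            "stddev": math.sqrt(var), "sum": total, "count": n}

s = summarize([1.0, 2.0, 3.0])
assert s["avg"] == 2.0 and s["min"] == 1.0 and s["max"] == 3.0
```

A proper latency histogram (fixed bucket boundaries shared by all reporters) would be better still, since averages hide the tail we'd most want to see regress.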
- I'd love to see data on OSD utilization/variance as well. (I could
have used that this morning to check how this varied for clusters with
non-2^n pg_num, but it'd also help us show the improvement over time
as folks roll out the new automations etc.)
We can either grab this from the OSD daemons, or again ask Prometheus.
- Do we know anything about the client versions talking to us, beyond
require_min_compat_client?
- We may want to get more details on the services/gateways (iSCSI, NFS,
CIFS). Even just knowing whether they're used would be good.
- I'd pull contact/organization/description into a separate section and
channel. We'll need to also document what this information is used
for.
Basically, this is a long laundry list of wishes for more detail. ;-)
I'm wondering what the best way is to track all these wishes and then
decide which to fulfil.
Regards,
Lars
--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)
Hi Folks,
Perf meeting is on in ~15 minutes! No set agenda for today, other than
that the bluestore cache rebase is almost set to merge. Please feel
free to add your own! Otherwise, probably a short meeting today.
Etherpad:
https://pad.ceph.com/p/performance_weekly
Bluejeans:
https://bluejeans.com/908675367
Thanks,
Mark
Can someone help me set up access to an external CephFS cluster without
third-party tools like Rook?
I wasn't able to follow your instructions from GitHub.
I'm looking for step-by-step instructions. My Ceph version is
"nautilus".
Thanks,
Konstantin
Hello,
We are a factory specialising in making various kinds of promotional
products, such as
non-woven drawstring bags with the customer's logo on them
plastic advertising fans with the customer's logo on them
PVC sun visor caps with the customer's logo on them
cotton t-shirts with the customer's logo on them
ball-point pens with the customer's logo on them
...........
We can offer very good prices and fast delivery.
Would you like to receive our online catalog and take a look?
Please give a reply to this email.
Thanks.
This is the second bug fix release of the Ceph Nautilus release series.
We recommend that all Nautilus users upgrade to this release. When
upgrading from older releases of Ceph, the general guidelines for
upgrading to Nautilus must be followed.
Notable Changes
---------------
* The no{up,down,in,out} related commands have been revamped. There are
now two ways to set the no{up,down,in,out} flags: the old 'ceph osd
[un]set <flag>' command, which sets cluster-wide flags; and the new
'ceph osd [un]set-group <flags> <who>' command, which sets flags in
batch at the granularity of any CRUSH node or device class.
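For example, to flag only the OSDs under a single CRUSH host (here assuming a host bucket named foo) or under one device class, following the syntax above:

```
# Old way: cluster-wide flag
ceph osd set noout
ceph osd unset noout

# New way: only the OSDs under one CRUSH node ...
ceph osd set-group noout,noin foo
ceph osd unset-group noout,noin foo

# ... or one device class
ceph osd set-group noout ssd
```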
* radosgw-admin introduces two subcommands that allow managing
expire-stale objects that might be left behind after a bucket reshard
in earlier versions of RGW. One subcommand lists such objects and the
other deletes them. Read the troubleshooting section of the dynamic
resharding docs for details.
* Earlier Nautilus releases (14.2.1 and 14.2.0) have an issue where deploying a
single new (Nautilus) BlueStore OSD on an upgraded cluster (i.e. one that was
originally deployed pre-Nautilus) breaks the pool utilization stats reported
by ceph df. Until all OSDs have been reprovisioned or updated (via
ceph-bluestore-tool repair), the pool stats will show values that are lower
than the true value. This is resolved in 14.2.2, such that the cluster only
switches to using the more accurate per-pool stats after all OSDs are 14.2.2
(or later), are BlueStore, and (if they were created prior to Nautilus) have
been updated via the repair function.
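A repair is run per OSD while that daemon is stopped, along the lines of the following (OSD id 0 and the default data path are used as placeholders):

```
systemctl stop ceph-osd@0
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0
systemctl start ceph-osd@0
```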
* The default value for mon_crush_min_required_version has been changed from
firefly to hammer, which means the cluster will issue a health warning if
your CRUSH tunables are older than hammer. There is generally a small (but
non-zero) amount of data that will move around by making the switch to hammer
tunables.
If possible, we recommend that you set the oldest allowed client to hammer or
later. You can tell what the current oldest allowed client is with:
ceph osd dump | grep min_compat_client
If the current value is older than hammer, you can verify that it is
safe to make this change by checking that there are no clients older
than hammer currently connected to the cluster with:
ceph features
The newer straw2 CRUSH bucket type was introduced in hammer, and ensuring
that all clients are hammer or newer allows new features only supported for
straw2 buckets to be used, including the crush-compat mode for the Balancer.
For a detailed changelog please refer to the official release notes
entry at the ceph blog: https://ceph.com/releases/v14-2-2-nautilus-released/
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-14.2.2.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 4f8fa0a0024755aae7d95567c63f11d6862d55be
Adding Alfredo, Ken, and dev@,
I think there are some open questions about when the needed
py3 dependencies will be in place?
sage
On Mon, 22 Jul 2019, Ricardo Dias wrote:
> Hi Sage,
>
> This morning during our daily dashboard standup, the question was
> raised of whether Python 2 will still need to be supported in Octopus.
>
> The discussion around this topic on the mailing lists is not very
> clear.
> My understanding is that we're going to remove Python 2 support in
> Octopus, but it still hasn't happened in master.
>
> Can we get a clear statement on the mailing list as to whether the
> above is correct, i.e. that we will indeed move to Python 3-only
> support?
>
> Thanks,
> --
> Ricardo Dias
> Senior Software Engineer - Storage Team
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
> HRB 21284
> (AG Nürnberg)
>
>
A broad question for the cephfs user community:
I've been looking at adding inline_data write support for kcephfs [1].
It's non-trivial to handle correctly in the kernel (primarily due to
the more complex locking), and I'm finding some bugs in what's already
there.
Is anyone actually enabling inline_data in their environments? Is this
something we should be expending effort to support?
Thanks,
--
Jeff Layton <jlayton(a)redhat.com>
[1]: http://docs.ceph.com/docs/master/cephfs/experimental-features/#inline-data