Hello All,
I am a graduate student in the Computer Engineering department at Iowa State
University in Ames, IA, and I am very interested in working on the
"Teuthology Scheduling Improvements" project. At ISU, I have been
specializing in embedded and storage systems. Recently, I have worked on
benchmarking and memory tracing of various databases (Redis, RocksDB,
MongoDB, Memcached, etc.) to find ways to improve their memory usage by
utilizing persistent memory. During my studies, I have also taken courses on
Operating Systems, Real-Time Systems, and Data Storage Systems.
In the last few days, I have looked at the documentation at docs.ceph.com and
the source code of the Teuthology framework at
https://github.com/ceph/teuthology to understand its motivation and
architecture. If you have any pointers related to the project that would be
useful for evaluating the project work, please let me know.
Thanks & Regards,
Prakhar Bansal
There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on
March 25, 2020 at 1800 PST, and will run for thirty minutes. Everyone with
a documentation-related request or complaint is invited. The meeting will
be held here: https://bluejeans.com/908675367
Send documentation-related requests and complaints to me by replying to
this email and CCing me at zac.dover(a)gmail.com.
This message will be sent to dev(a)ceph.io every Monday morning, North
American time.
The next DocuBetter meeting is scheduled for:
25 Mar 2020 1800 PST
26 Mar 2020 0100 UTC
26 Mar 2020 1100 AEST
Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Meeting: https://bluejeans.com/908675367
Thanks, everyone.
Zac Dover
Hi everyone,
We've scheduled tentative time slots for an online Ceph Developer Summit
for Pacific. The times and topics are on this pad
https://pad.ceph.com/p/cds-pacific
but I've also added sessions to the Ceph Community google calendar, which
is probably a better reference since it isn't subject to time zone math
errors and typos.
https://calendar.google.com/calendar/b/1?cid=OXRzOWM3bHQ3dTF2aWMyaWp2dnFxbG…
There isn't a dedicated RBD session scheduled, but there probably should
be. Jason, when do you want to do it?
There are topics listed in the invites, but it is probably easier to
manage the agenda on the pad.
Please let us know if we should make adjustments to these times.
Everyone's schedules are pretty fluid with the pandemic and wonky work
schedules, but it's easy to move these meetings around or schedule
separate time slots to discuss other topics to accommodate conflicts and
time zones.
Thanks!
sage
Hi everyone,
As we wrap up Octopus and kick off development for Pacific, it seems like a
good idea to sort out what to call the Q release.
Traditionally/historically, these have always been names of cephalopod
species--usually the "common name", but occasionally a latin name
(infernalis).
Q is a bit of a challenge since there aren't many of either that start
with Q. Nick Barcet found one: quebecoceras, an extinct genus of nautilus
(https://en.wikipedia.org/wiki/Quebecoceras).
The only other Q cephalopod reference I could find was Squidward Q
Tentacles, a character (octopus, strangely) from Spongebob Squarepants,
and Yehuda figured out that the Q stands for Quincy.
So far that's it. If you can find any other options, please catalog them
on the etherpad:
https://pad.ceph.com/p/q
(or even get a head start on future releases.. they're always the
single-letter pads, e.g., https://pad.ceph.com/p/r).
sage
I modified our sync-push script a little so the symlinks are relative
instead of absolute. (https://github.com/ceph/ceph-build/pull/1534)
In addition to that change, if you add `--copy-links` to your rsync
command, it syncs as expected now.
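For example, a mirror pull along these lines should now behave as expected (a
sketch; the destination path is illustrative):
  rsync -aiv --copy-links rsync://download.ceph.com/ceph/rpm-octopus/ ./rpm-octopus/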
On 3/24/20 1:48 PM, David Galloway wrote:
> I have tried on-and-off all morning to get the magical combination of
> rsync flags to get this working and I don't think it's possible.
>
> The plan is, for each version released, have that version go in its
> respective /rpm-X.X.X directory. Once all the packages are pushed, the
> /rpm-octopus symlink is then pointed to the new version rpm-X.X.X dir.
>
> @Sage what are your thoughts on rsyncd currently being broken due to
> this new directory structure? It has also broken our mirror syncing
> procedure. e.g., http://au.ceph.com/ doesn't have {deb,rpm}-octopus dirs.
>
> On 3/24/20 1:32 PM, Eric Goirand wrote:
>> Thanks Abhishek for the rpm-15.2.0 link.
>> @David are there any plans to have rsync still work with rpm-octopus as
>> it is the case for http access ?
>> If you plan to move to rpm-15.2.0, would it still be the same directory
>> when you go to 15.2.1 or 15.2.x? It will be difficult to change the
>> DNF or YUM URLs to cope with these changes.
>> Thanks and Regards,
>> Eric.
>>
>> On Tue, Mar 24, 2020 at 3:10 PM Abhishek Lekshmanan <abhishek(a)suse.com> wrote:
>>
>> Eric Goirand <egoirand(a)redhat.com> writes:
>>
>> > Hello Abhishek,
>> Hi Eric
>> >
>> > Many Many Thanks to all the contributors for this awesome milestone !
>> > Two thumbs up to you all !
>>
>> The directory structure changed (buried somewhere in the release
>> announcement) mainly so that we can make it easy to retire a bad release,
>> etc., so rpm-octopus should be a symlink to
>> https://download.ceph.com/rpm-15.2.0/
>>
>> I've CC'ed David who's in charge of the infra.
>>
>>
>> > Note however that the Octopus repository is no longer accessible via
>> > rsync:
>> > rsync -aiv --dry-run rsync://download.ceph.com/ceph/rpm-octopus/ .
>> > receiving incremental file list
>> > rsync: change_dir "/rpm-octopus" (in ceph) failed: No such file or
>> > directory (2)
>> >
>> > But it is still accessible using http:
>> > http://download.ceph.com/rpm-octopus/
>> >
>> > Would you know who to contact to have the rsync access work again?
>> >
>> > Many Thanks,
>> >
>> > Eric.
>> >
>> >
>> > On Tue, Mar 24, 2020 at 12:41 PM Abhishek Lekshmanan <abhishek(a)suse.com>
>> > wrote:
>> >
>> >>
>> >> We're happy to announce the first stable release of Octopus v15.2.0.
>> >> There are a lot of changes and new features added; we advise everyone
>> >> to read the release notes carefully, and in particular the upgrade
>> >> notes, before upgrading. Please refer to the official blog entry
>> >> https://ceph.io/releases/v15-2-0-octopus-released/ for a detailed
>> >> version with links & changelog.
>> >>
>> >> This release wouldn't have been possible without the support of the
>> >> community. It saw contributions from over 330 developers & 80
>> >> organizations, and we thank everyone for making this release happen.
>> >>
>> >> Major Changes from Nautilus
>> >> ---------------------------
>> >> General
>> >> ~~~~~~~
>> >> * A new deployment tool called **cephadm** has been introduced that
>> >> integrates Ceph daemon deployment and management via containers
>> >> into the orchestration layer.
>> >> * Health alerts can now be muted, either temporarily or permanently.
>> >> * Health alerts are now raised for recent Ceph daemon crashes.
>> >> * A simple 'alerts' module has been introduced to send email
>> >> health alerts for clusters deployed without the benefit of an
>> >> existing external monitoring infrastructure.
>> >> * Packages are built for the following distributions:
>> >> - CentOS 8
>> >> - CentOS 7 (partial--see below)
>> >> - Ubuntu 18.04 (Bionic)
>> >> - Debian Buster
>> >> - Container images (based on CentOS 8)
>> >>
>> >> Note that the dashboard, prometheus, and restful manager modules
>> >> will not work on the CentOS 7 build due to Python 3 module
>> >> dependencies that are missing in CentOS 7.
>> >>
>> >> Besides this, packages built by the community will also be available
>> >> for the following distros:
>> >> - Fedora (33/rawhide)
>> >> - openSUSE (15.2, Tumbleweed)
>> >>
>> >> Dashboard
>> >> ~~~~~~~~~
>> >> The mgr-dashboard has gained a lot of new features and functionality:
>> >>
>> >> * UI Enhancements
>> >> - New vertical navigation bar
>> >> - New unified sidebar: better background task and events
>> notification
>> >> - Shows all progress mgr module notifications
>> >> - Multi-select on tables to perform bulk operations
>> >>
>> >> * Dashboard user account security enhancements
>> >> - Disabling/enabling existing user accounts
>> >> - Clone an existing user role
>> >> - Users can change their own password
>> >> - Configurable password policies: Minimum password
>> complexity/length
>> >> requirements
>> >> - Configurable password expiration
>> >> - Change password after first login
>> >>
>> >> New and enhanced management of Ceph features/services:
>> >>
>> >> * OSD/device management
>> >> - List all disks associated with an OSD
>> >> - Add support for blinking enclosure LEDs via the orchestrator
>> >> - List all hosts known by the orchestrator
>> >> - List all disks and their properties attached to a node
>> >> - Display disk health information (health prediction and SMART
>> data)
>> >> - Deploy new OSDs on new disks/hosts
>> >> - Display and allow sorting by an OSD's default device class in the
>> >> OSD table
>> >> - Explicitly set/change the device class of an OSD, display and sort
>> >> OSDs by device class
>> >>
>> >> * Pool management
>> >> - Viewing and setting pool quotas
>> >> - Define and change per-pool PG autoscaling mode
>> >>
>> >> * RGW management enhancements
>> >> - Enable bucket versioning
>> >> - Enable MFA support
>> >> - Select placement target on bucket creation
>> >>
>> >> * CephFS management enhancements
>> >> - CephFS client eviction
>> >> - CephFS snapshot management
>> >> - CephFS quota management
>> >> - Browse CephFS directory
>> >>
>> >> * iSCSI management enhancements
>> >> - Show iSCSI GW status on landing page
>> >> - Prevent deletion of IQNs with open sessions
>> >> - Display iSCSI "logged in" info
>> >>
>> >> * Prometheus alert management
>> >> - List configured Prometheus alerts
>> >>
>> >> RADOS
>> >> ~~~~~
>> >> * Objects can now be brought in sync during recovery by copying only
>> >> the modified portion of the object, reducing tail latencies during
>> >> recovery.
>> >> * Ceph will allow recovery below *min_size* for Erasure coded pools,
>> >> wherever possible.
>> >> * The PG autoscaler feature introduced in Nautilus is enabled for
>> >> new pools by default, allowing new clusters to autotune *pg num*
>> >> without any user intervention. The default values for new pools
>> >> and RGW/CephFS metadata pools have also been adjusted to perform
>> >> well for most users.
>> >> * BlueStore has received several improvements and performance
>> >> updates, including improved accounting for "omap" (key/value)
>> >> object data by pool, improved cache memory management, and a
>> >> reduced allocation unit size for SSD devices. (Note that by
>> >> default, the first time each OSD starts after upgrading to octopus
>> >> it will trigger a conversion that may take from a few minutes to a
>> >> few hours, depending on the amount of stored "omap" data.)
>> >> * Snapshot trimming metadata is now managed in a more efficient and
>> >> scalable fashion.
>> >>
>> >> RBD block storage
>> >> ~~~~~~~~~~~~~~~~~
>> >> * Mirroring now supports a new snapshot-based mode that no longer
>> >> requires the journaling feature and its related impacts in exchange
>> >> for the loss of point-in-time consistency (it remains crash
>> >> consistent).
>> >> * Clone operations now preserve the sparseness of the underlying RBD
>> >> image.
>> >> * The trash feature has been improved to (optionally) automatically
>> >> move old parent images to the trash when their children are all
>> >> deleted or flattened.
>> >> * The trash can be configured to automatically purge on a defined
>> >> schedule.
>> >> * Images can be online re-sparsified to reduce the usage of zeroed
>> >> extents.
>> >> * The `rbd-nbd` tool has been improved to use more modern kernel
>> >> interfaces.
>> >> * Caching has been improved to be more efficient and performant.
>> >> * `rbd-mirror` automatically adjusts its per-image memory usage based
>> >> upon its memory target.
>> >> * A new persistent read-only caching daemon is available to offload
>> >> reads from shared parent images.
>> >>
>> >> RGW object storage
>> >> ~~~~~~~~~~~~~~~~~~
>> >> * New `Multisite Sync Policy` primitives for per-bucket replication.
>> >> (EXPERIMENTAL)
>> >> * S3 feature support:
>> >> - Bucket Replication (EXPERIMENTAL)
>> >> - `Bucket Notifications`_ via HTTP/S, AMQP and Kafka
>> >> - Bucket Tagging
>> >> - Object Lock
>> >> - Public Access Block for buckets
>> >> * Bucket sharding:
>> >> - Significantly improved listing performance on buckets with many
>> >> shards.
>> >> - Dynamic resharding prefers prime shard counts for improved
>> >> distribution.
>> >> - Raised the default number of bucket shards to 11.
>> >> * Added `HashiCorp Vault Integration`_ for SSE-KMS.
>> >> * Added Keystone token cache for S3 requests.
>> >>
>> >> CephFS distributed file system
>> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> >> * Inline data support in CephFS has been deprecated and will likely
>> >> be removed in a future release.
>> >> * MDS daemons can now be assigned to manage a particular file system
>> >> via the new `mds_join_fs` option.
>> >> * MDS now aggressively asks idle clients to trim caps, which improves
>> >> stability when file system load changes.
>> >> * The mgr volumes plugin has received numerous improvements to support
>> >> CephFS via CSI, including snapshots and cloning.
>> >> * cephfs-shell has had numerous incremental improvements and bug
>> >> fixes.
>> >>
>> >>
>> >> Upgrading from Mimic or Nautilus
>> >> --------------------------------
>> >> You can monitor the progress of your upgrade at each stage with the
>> >> `ceph versions` command, which will tell you what ceph
>> version(s) are
>> >> running for each type of daemon.
>> >>
>> >> Instructions
>> >> ~~~~~~~~~~~~
>> >> #. Make sure your cluster is stable and healthy (no down or
>> >> recovering OSDs). (Optional, but recommended.)
>> >>
>> >> #. Set the `noout` flag for the duration of the upgrade. (Optional,
>> >> but recommended.)::
>> >>
>> >> # ceph osd set noout
>> >>
>> >> #. Upgrade monitors by installing the new packages and restarting the
>> >> monitor daemons. For example, on each monitor host,::
>> >>
>> >> # systemctl restart ceph-mon.target
>> >>
>> >> Once all monitors are up, verify that the monitor upgrade is
>> >> complete by looking for the `octopus` string in the mon
>> >> map. The command::
>> >>
>> >> # ceph mon dump | grep min_mon_release
>> >>
>> >> should report::
>> >>
>> >> min_mon_release 15 (octopus)
>> >>
>> >> If it doesn't, that implies that one or more monitors haven't been
>> >> upgraded and restarted, and/or the quorum does not include all
>> >> monitors.
>> >>
>> >> #. Upgrade `ceph-mgr` daemons by installing the new packages and
>> >> restarting all manager daemons. For example, on each manager
>> host,::
>> >>
>> >> # systemctl restart ceph-mgr.target
>> >>
>> >> Verify the `ceph-mgr` daemons are running by checking `ceph
>> >> -s`::
>> >>
>> >> # ceph -s
>> >>
>> >> ...
>> >> services:
>> >> mon: 3 daemons, quorum foo,bar,baz
>> >> mgr: foo(active), standbys: bar, baz
>> >> ...
>> >>
>> >> #. Upgrade all OSDs by installing the new packages and restarting the
>> >> ceph-osd daemons on all OSD hosts::
>> >>
>> >> # systemctl restart ceph-osd.target
>> >>
>> >> Note that the first time each OSD starts, it will do a format
>> >> conversion to improve the accounting for "omap" data. This may
>> >> take a few minutes to as much as a few hours (for an HDD with lots
>> >> of omap data). You can disable this automatic conversion with::
>> >>
>> >> # ceph config set osd bluestore_fsck_quick_fix_on_mount false
>> >>
>> >> You can monitor the progress of the OSD upgrades with the
>> >> `ceph versions` or `ceph osd versions` commands::
>> >>
>> >> # ceph osd versions
>> >> {
>> >> "ceph version 13.2.5 (...) mimic (stable)": 12,
>> >> "ceph version 15.2.0 (...) octopus (stable)": 22,
>> >> }
>> >>
>> >> #. Upgrade all CephFS MDS daemons. For each CephFS file system,
>> >>
>> >> #. Reduce the number of ranks to 1. (Make note of the original
>> >> number of MDS daemons first if you plan to restore it later.)::
>> >>
>> >> # ceph status
>> >> # ceph fs set <fs_name> max_mds 1
>> >>
>> >> #. Wait for the cluster to deactivate any non-zero ranks by
>> >> periodically checking the status::
>> >>
>> >> # ceph status
>> >>
>> >> #. Take all standby MDS daemons offline on the appropriate
>> hosts with::
>> >>
>> >> # systemctl stop ceph-mds@<daemon_name>
>> >>
>> >> #. Confirm that only one MDS is online and is rank 0 for your FS::
>> >>
>> >> # ceph status
>> >>
>> >> #. Upgrade the last remaining MDS daemon by installing the new
>> >> packages and restarting the daemon::
>> >>
>> >> # systemctl restart ceph-mds.target
>> >>
>> >> #. Restart all standby MDS daemons that were taken offline::
>> >>
>> >> # systemctl start ceph-mds.target
>> >>
>> >> #. Restore the original value of `max_mds` for the volume::
>> >>
>> >> # ceph fs set <fs_name> max_mds <original_max_mds>
>> >>
>> >> #. Upgrade all radosgw daemons by upgrading packages and restarting
>> >> daemons on all hosts::
>> >>
>> >> # systemctl restart ceph-radosgw.target
>> >>
>> >> #. Complete the upgrade by disallowing pre-Octopus OSDs and enabling
>> >> all new Octopus-only functionality::
>> >>
>> >> # ceph osd require-osd-release octopus
>> >>
>> >> #. If you set `noout` at the beginning, be sure to clear it with::
>> >>
>> >> # ceph osd unset noout
>> >>
>> >> #. Verify the cluster is healthy with `ceph health`.
>> >>
>> >> If your CRUSH tunables are older than Hammer, Ceph will now
>> issue a
>> >> health warning. If you see a health alert to that effect, you can
>> >> revert this change with::
>> >>
>> >> ceph config set mon mon_crush_min_required_version firefly
>> >>
>> >> If Ceph does not complain, however, then we recommend you also
>> >> switch any existing CRUSH buckets to straw2, which was added back
>> >> in the Hammer release. If you have any 'straw' buckets, this will
>> >> result in a modest amount of data movement, but generally nothing
>> >> too severe.::
>> >>
>> >> ceph osd getcrushmap -o backup-crushmap
>> >> ceph osd crush set-all-straw-buckets-to-straw2
>> >>
>> >> If there are problems, you can easily revert with::
>> >>
>> >> ceph osd setcrushmap -i backup-crushmap
>> >>
>> >> Moving to 'straw2' buckets will unlock a few recent features, like
>> >> the `crush-compat` :ref:`balancer <balancer>` mode added back in
>> >> Luminous.
>> >>
>> >> #. If you are upgrading from Mimic, or did not already do so when you
>> >> upgraded to Nautilus, we recommend you enable the new :ref:`v2
>> >> network protocol <msgr2>`. Issue the following command::
>> >>
>> >> ceph mon enable-msgr2
>> >>
>> >> This will instruct all monitors that bind to the old default port
>> >> 6789 for the legacy v1 protocol to also bind to the new 3300 v2
>> >> protocol port. To see if all monitors have been updated,::
>> >>
>> >> ceph mon dump
>> >>
>> >> and verify that each monitor has both a `v2:` and `v1:` address
>> >> listed.
>> >>
>> >> #. Consider enabling the :ref:`telemetry module <telemetry>` to send
>> >> anonymized usage statistics and crash information to the Ceph
>> >> upstream developers. To see what would be reported (without
>> actually
>> >> sending any information to anyone),::
>> >>
>> >> ceph mgr module enable telemetry
>> >> ceph telemetry show
>> >>
>> >> If you are comfortable with the data that is reported, you can
>> opt-in to
>> >> automatically report the high-level cluster metadata with::
>> >>
>> >> ceph telemetry on
>> >>
>> >> For more information about the telemetry module, see :ref:`the
>> >> documentation <telemetry>`.
>> >>
>> >>
>> >> Upgrading from pre-Mimic releases (like Luminous)
>> >> -------------------------------------------------
>> >> You *must* first upgrade to Mimic (13.2.z) or Nautilus (14.2.z)
>> before
>> >> upgrading to Octopus.
>> >>
>> >>
>> >> Upgrade compatibility notes
>> >> ---------------------------
>> >> * Starting with Octopus, there is now a separate repository directory
>> >> for each version on `download.ceph.com` (e.g., `rpm-15.2.0` and
>> >> `debian-15.2.0`). The traditional package directory that is named
>> >> after the release (e.g., `rpm-octopus` and `debian-octopus`) is
>> >> now a symlink to the most recent bug fix version for that release.
>> >> We no longer generate a single repository that combines all bug fix
>> >> versions for a single named release.
>> >>
>> >> * The RGW "num_rados_handles" has been removed.
>> >> If you were using a value of "num_rados_handles" greater than 1
>> >> multiply your current "objecter_inflight_ops" and
>> >> "objecter_inflight_op_bytes" paramaeters by the old
>> >> "num_rados_handles" to get the same throttle behavior.
>> >>
>> >> * Ceph now packages python bindings for python3.6 instead of
>> >> python3.4, because python3 in EL7/EL8 is now using python3.6
>> >> as the native python3. See the `announcement`_
>> >> for more details on the background of this change.
>> >>
>> >> * librbd now uses a write-around cache policy by default,
>> >> replacing the previous write-back cache policy default.
>> >> This cache policy allows librbd to immediately complete
>> >> write IOs while they are still in-flight to the OSDs.
>> >> Subsequent flush requests will ensure all in-flight
>> >> write IOs are completed before the flush itself completes. The
>> >> librbd cache policy can be controlled via a new
>> >> "rbd_cache_policy" configuration option.
>> >>
>> >> * librbd now includes a simple IO scheduler which attempts to
>> >> batch together multiple IOs against the same backing RBD
>> >> data block object. The librbd IO scheduler policy can be
>> >> controlled via a new "rbd_io_scheduler" configuration
>> >> option.
>> >>
>> >> * RGW: radosgw-admin introduces two subcommands that allow the
>> >> managing of expire-stale objects that might be left behind after a
>> >> bucket reshard in earlier versions of RGW. One subcommand lists such
>> >> objects and the other deletes them. Read the troubleshooting section
>> >> of the dynamic resharding docs for details.
>> >>
>> >> * RGW: Bucket naming restrictions have changed and are likely to cause
>> >> InvalidBucketName errors. We recommend setting the
>> >> `rgw_relaxed_s3_bucket_names`
>> >> option to true as a workaround.
>> >>
>> >> * In the Zabbix Mgr Module there was a typo in the key being sent
>> >> to Zabbix for PGs in backfill_wait state. The key that was sent
>> >> was 'wait_backfill' and the correct name is 'backfill_wait'.
>> >> Update your Zabbix template accordingly so that it accepts the
>> >> new key being sent to Zabbix.
>> >>
>> >> * The zabbix plugin for the ceph manager now includes osd and pool
>> >> discovery. An update of zabbix_template.xml is needed
>> >> to receive per-pool (read/write throughput, diskspace usage)
>> >> and per-osd (latency, status, pgs) statistics.
>> >>
>> >> * The format of all date + time stamps has been modified to fully
>> >> conform to ISO 8601. The old format (`YYYY-MM-DD
>> >> HH:MM:SS.ssssss`) excluded the `T` separator between the date and
>> >> time and was rendered using the local time zone without any
>> explicit
>> >> indication. The new format includes the separator as well as a
>> >> `+nnnn` or `-nnnn` suffix to indicate the time zone, or a `Z`
>> >> suffix if the time is UTC. For example,
>> >> `2019-04-26T18:40:06.225953+0100`.
>> >>
>> >> Any code or scripts that were previously parsing date and/or time
>> >> values from the JSON or XML structure CLI output should be checked
>> >> to ensure it can handle ISO 8601 conformant values. Any code
>> >> parsing date or time values from the unstructured human-readable
>> >> output should be modified to parse the structured output
>> instead, as
>> >> the human-readable output may change without notice.
>> >>
>> >> * The `bluestore_no_per_pool_stats_tolerance` config option has been
>> >> replaced with `bluestore_fsck_error_on_no_per_pool_stats`
>> >> (default: false). The overall default behavior has not changed:
>> >> fsck will warn but not fail on legacy stores, and repair will
>> >> convert to per-pool stats.
>> >>
>> >> * The disaster-recovery related 'ceph mon sync force' command has
>> been
>> >> replaced with 'ceph daemon <...> sync_force'.
>> >>
>> >> * The `osd_recovery_max_active` option now has
>> >> `osd_recovery_max_active_hdd` and `osd_recovery_max_active_ssd`
>> >> variants, each with different default values for HDD and SSD-backed
>> >> OSDs, respectively. `osd_recovery_max_active` now
>> >> defaults to zero, which means that the OSD will conditionally use
>> >> the HDD or SSD option values. Administrators who have customized
>> >> this value may want to consider whether they have set this to a
>> >> value similar to the new defaults (3 for HDDs and 10 for SSDs) and,
>> >> if so, remove the option from their configuration entirely.
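>> >>
>> >> For example, if you had customized this value via the centralized
>> >> config database, dropping it so the new per-device-class defaults take
>> >> effect would look like (a sketch)::
>> >>
>> >> ceph config rm osd osd_recovery_max_active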
>> >>
>> >> * Monitors now have a `ceph osd info` command that will provide
>> >> information on all OSDs, or on the specified OSDs, so you no longer
>> >> need to parse `osd dump` for the same information.
>> >>
>> >> * The structured output of `ceph status` or `ceph -s` is now more
>> >> concise, particularly the `mgrmap` and `monmap` sections, and the
>> >> structure of the `osdmap` section has been cleaned up.
>> >>
>> >> * A health warning is now generated if the average osd heartbeat ping
>> >> time exceeds a configurable threshold for any of the intervals
>> >> computed. The OSD computes 1 minute, 5 minute and 15 minute
>> >> intervals with average, minimum and maximum values. The new
>> >> configuration option `mon_warn_on_slow_ping_ratio` specifies a
>> >> percentage of `osd_heartbeat_grace` to determine the threshold. A
>> >> value of zero disables the warning. The new configuration option
>> >> `mon_warn_on_slow_ping_time`, specified in milliseconds, overrides
>> >> the computed value and causes a warning when OSD heartbeat pings take
>> >> longer than the specified amount. The new admin command `ceph daemon
>> >> mgr.# dump_osd_network [threshold]` will list all
>> >> connections with a ping time longer than the specified threshold or
>> >> the value determined by the config options, for the average of any of
>> >> the 3 intervals. The new admin command `ceph daemon osd.#
>> >> dump_osd_network [threshold]` will do the same, but only including
>> >> heartbeats initiated by the specified OSD.
>> >>
>> >> * Inline data support for CephFS has been deprecated. When setting the
>> >> flag, users will see a warning to that effect, and enabling it now
>> >> requires the `--yes-i-really-really-mean-it` flag. If the MDS is
>> >> started on a filesystem that has it enabled, a health warning is
>> >> generated. Support for this feature will be removed in a future
>> >> release.
>> >>
>> >> * `ceph {set,unset} full` is not supported anymore. We have been using
>> >> the `full` and `nearfull` flags in the OSD map for tracking the
>> >> fullness status of a cluster since the Hammer release; if the OSD
>> >> map is marked `full`, all write operations will be blocked until
>> >> this flag is removed. In the Infernalis release and the Linux kernel
>> >> 4.7 client, we introduced the per-pool full/nearfull flags to track
>> >> the status with finer granularity, so clients will hold write
>> >> operations if either the cluster-wide `full` flag or the per-pool
>> >> `full` flag is set. This was a compromise, as we needed to support
>> >> clusters with and without per-pool `full` flag support, but it
>> >> practically defeated the purpose of introducing the per-pool flags.
>> >> So, in the Mimic release, the new flags finally took the place of
>> >> their cluster-wide counterparts, as the monitor started removing
>> >> these two flags from the OSD map. Clients running Infernalis and up
>> >> benefit from this change, as they won't be blocked by full pools
>> >> they are not writing to. In this release, `ceph {set,unset} full`
>> >> is now considered an invalid command, and clients will continue
>> >> honoring both the cluster-wide and per-pool flags to remain backward
>> >> compatible with pre-Infernalis clusters.
>> >>
>> >> * The telemetry module now reports more information.
>> >>
>> >> First, there is a new 'device' channel, enabled by default, that
>> >> will report anonymized hard disk and SSD health metrics to
>> >> telemetry.ceph.com in order to build and improve device failure
>> >> prediction algorithms. If you are not comfortable sharing device
>> >> metrics, you can disable that channel first before re-opting-in::
>> >>
>> >> ceph config set mgr mgr/telemetry/channel_device false
>> >>
>> >> Second, we now report more information about CephFS file systems,
>> >> including:
>> >>
>> >> - how many MDS daemons (in total and per file system)
>> >> - which features are (or have been) enabled
>> >> - how many data pools
>> >> - approximate file system age (year + month of creation)
>> >> - how many files, bytes, and snapshots
>> >> - how much metadata is being cached
>> >>
>> >> We have also added:
>> >>
>> >> - which Ceph release the monitors are running
>> >> - whether msgr v1 or v2 addresses are used for the monitors
>> >> - whether IPv4 or IPv6 addresses are used for the monitors
>> >> - whether RADOS cache tiering is enabled (and which mode)
>> >> - whether pools are replicated or erasure coded, and
>> >> which erasure code profile plugin and parameters are in use
>> >> - how many hosts are in the cluster, and how many hosts have each
>> >> type of daemon
>> >> - whether a separate OSD cluster network is being used
>> >> - how many RBD pools and images are in the cluster, and how many
>> >> pools have RBD mirroring enabled
>> >> - how many RGW daemons, zones, and zonegroups are present; which RGW
>> >> frontends are in use
>> >> - aggregate stats about the CRUSH map, like which algorithms are
>> >> used, how big buckets are, how many rules are defined, and what
>> >> tunables are in use
>> >>
>> >> If you had telemetry enabled, you will need to re-opt-in with::
>> >>
>> >> ceph telemetry on
>> >>
>> >> You can view exactly what information will be reported first with::
>> >>
>> >> $ ceph telemetry show # see everything
>> >> $ ceph telemetry show basic # basic cluster info (including all of the new info)
>> >>
>> >> * The following invalid settings are no longer tolerated
>> >> for the command `ceph osd erasure-code-profile set xxx`:
>> >> * invalid `m` for the "reed_sol_r6_op" erasure technique
>> >> * invalid `m` and invalid `w` for the "liber8tion" erasure technique
>> >>
>> >> * New OSD daemon command dump_recovery_reservations which reveals the
>> >> recovery locks held (in_progress) and waiting in priority queues.
>> >>
>> >> * New OSD daemon command dump_scrub_reservations which reveals the
>> >> scrub reservations that are held for local (primary) and remote
>> >> (replica) PGs.
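>> >>
>> >> For example (osd.0 is illustrative), on the OSD host::
>> >>
>> >> ceph daemon osd.0 dump_recovery_reservations
>> >> ceph daemon osd.0 dump_scrub_reservations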
>> >>
>> >> * Previously, `ceph tell mgr ...` could be used to call commands
>> >> implemented by mgr modules. This is no longer supported. Since
>> >> luminous, using `tell` has not been necessary: those same commands
>> >> are also accessible without the `tell mgr` portion (e.g., `ceph
>> >> tell mgr influx foo` is the same as `ceph influx foo`). `ceph
>> >> tell mgr ...` will now call admin commands--the same set of
>> >> commands accessible via `ceph daemon ...` when you are logged into
>> >> the appropriate host.
>> >>
>> >> * The `ceph tell` and `ceph daemon` commands have been unified,
>> >> such that all such commands are accessible via either interface.
>> >> Note that ceph-mgr tell commands are accessible via either `ceph
>> >> tell mgr ...` or `ceph tell mgr.<id> ...`, and it is only
>> >> possible to send tell commands to the active daemon (the
>> standbys do
>> >> not accept incoming connections over the network).
>> >>
>> >> * Ceph will now issue a health warning if a RADOS pool has a `pg_num`
>> >> value that is not a power of two. This can be fixed by adjusting
>> >> the pool to a nearby power of two::
>> >>
>> >> ceph osd pool set <pool-name> pg_num <new-pg-num>
>> >>
>> >> Alternatively, the warning can be silenced with::
>> >>
>> >> ceph config set global mon_warn_on_pool_pg_num_not_power_of_two false
>> >>
>> >> * The format of MDSs in `ceph fs dump` has changed.
>> >>
>> >> * The `mds_cache_size` config option is completely removed. Since
>> >> luminous, the `mds_cache_memory_limit` config option has been
>> >> preferred for configuring the MDS's cache limits.
>> >>
>> >> * The `pg_autoscale_mode` is now set to `on` by default for newly
>> >> created pools, which means that Ceph will automatically manage the
>> >> number of PGs. To change this behavior, or to learn more about PG
>> >> autoscaling, see :ref:`pg-autoscaler`. Note that existing pools in
>> >> upgraded clusters will still be set to `warn` by default.
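>> >>
>> >> For example, to switch an existing pool in an upgraded cluster to
>> >> full automatic PG management (`<pool-name>` is a placeholder)::
>> >>
>> >> ceph osd pool set <pool-name> pg_autoscale_mode on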
>> >>
>> >> * The `upmap_max_iterations` config option of mgr/balancer has been
>> >> renamed to `upmap_max_optimizations` to better match its behaviour.
>> >>
>> >> * `mClockClientQueue` and `mClockClassQueue` OpQueue
>> >> implementations have been removed in favor of a single
>> >> `mClockScheduler` implementation of a simpler OSD interface.
>> >> Accordingly, the `osd_op_queue_mclock*` family of config options
>> >> has been removed in favor of the `osd_mclock_scheduler*` family
>> >> of options.
>> >>
>> >> * The config subsystem now searches dot ('.') delimited prefixes for
>> >> options. That means for an entity like `client.foo.bar`, its
>> >> overall configuration will be a combination of the global options,
>> >> `client`, `client.foo`, and `client.foo.bar`. Previously,
>> >> only global, `client`, and `client.foo.bar` options would apply.
>> >> This change may affect the configuration for clients that include a
>> >> `.` in their name.
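>> >>
>> >> For example (the entity and option are only illustrative), a client
>> >> named `client.foo.bar` would now also pick up a setting made with::
>> >>
>> >> ceph config set client.foo debug_ms 1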
>> >>
>> >> Getting Ceph
>> >> ------------
>> >> * Git at git://github.com/ceph/ceph.git
>> >> * Tarball at http://download.ceph.com/tarballs/ceph-15.2.0.tar.gz
>> >> * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
>> >> * Release git sha1: dc6a0b5c3cbf6a5e1d6d4f20b5ad466d76b96247
>> >>
>> >> --
>> >> Abhishek Lekshmanan
>> >> SUSE Software Solutions Germany GmbH
>> >> GF: Felix Imendörffer
>> --
>> Abhishek Lekshmanan
>> SUSE Software Solutions Germany GmbH
>> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
>>
Hi Abhi,
15.2.0 is pushed to download.ceph.com. I merged the last 2 doc PRs
targeted for octopus and merged that back into master. Your PR targets
master, so that'll need a cherry-pick/backport to the octopus branch after it
merges.
The v15.2[.0] tags aren't up on docker hub yet.. I'm not sure when the
nightly cron for that runs.
I think the only other thing left is to push out the email and blogs...
want to go ahead and do that in the morning?
Yay!
sage
Dear Community
Apologies for the lengthy email, but I wanted to share some thoughts
regarding our upstream test process (after being in the project for ~18
months).
TL;DR: execute teuthology tests directly from jenkins
Currently, during the PR submission process, jenkins builds ceph for one
target (ubuntu 18.04) and then runs the unit test suite.
Due to the nature of ceph, unit tests cover only part of the code,
and therefore the author is expected to run the changes through teuthology
as well.
The current process for running teuthology has several drawbacks:
- you have to submit your changes as a branch to the ceph-ci repo; this
kicks off builds for multiple targets, which usually take a couple of
hours (1-4) to finish. Testing can only start after the images are ready
- even though the code in the PR is merged into the latest version of the
branch you are merging to, the images tested in teuthology are built from
your branch, and whether it was rebased recently or not is up to the author
- you have to execute a teuthology test suite against these images; the
suites are pretty big, and run against several targets (very similar to our
nightlies). This means that the entire process takes hours to days to finish
- there is no guarantee that the code being tested in teuthology is the
same code being merged in the PR
- the manual analysis of the teuthology results is time consuming and error
prone, as you have to figure out which tests didn't run due to
infrastructure issues, which failures are expected and already in the
process of being fixed, etc.
As a result, developers, instead of treating teuthology as a valuable
verification tool, try to avoid it :-(
Often, running teuthology and analyzing the results is done by a handful
of experienced developers.
Suggestion:
In many projects, system and integration testing are run from jenkins
automatically - I would argue that this could be the case in our project as
well.
- select one target, similar to what we did with unit testing. The target
is automatically built in jenkins
- create smaller "sanity" suites in teuthology that would allow for faster
execution, and would test only the selected target
- jenkins can use teuthology and the sepia lab, so no changes would be
needed for actually running the tests
- to avoid extensive load on the sepia lab, the actual triggering could be
manual (e.g. "jenkins test rgw sanity"; see the sketch after this list).
This means that the author can select which area of the code to test, and
would execute it when the PR is in good enough shape for testing
- the results could be automatically analyzed against infrastructure
issues, test issues, etc. The knowledge needed for this analysis could be
coded and updated in jenkins
- in some cases, manual execution of specific tests or against specific
targets will be needed, and this could be done the same way it is currently
done
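As a concrete illustration (a sketch, not an existing integration): a trigger
like "jenkins test rgw sanity" could translate under the hood into an
ordinary teuthology-suite invocation against the jenkins-built target, e.g.:
  teuthology-suite --suite rgw --ceph wip-my-pr --machine-type smithi --priority 100
where the branch name and priority are placeholders, and the smaller "sanity"
subset of the rgw suite would first need to be created as proposed above.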
in the future...
- we may also automate the process of selecting which tests need to run,
according to the files being modified in the PR
- gating PRs on passing teuthology runs would serve as a tool for better
quality code, but also for more stable tests and infrastructure
Appreciate your feedback!
Yuval
I opened a pull request https://github.com/ceph/ceph/pull/33539 with a
design doc for dynamic resharding in multisite. Please review and give
feedback, either here or in PR comments!
Checking the word "Octopus" in different languages the only one starting
with a "Q" is in "Maltese": "Qarnit"
For good measure here is a Maltesian Qarnit stew recipe:
http://littlerock.com.mt/food/maltese-traditional-recipe-stuffat-tal-qarnit…
Respectfully,
*Wes Dillingham*
wes(a)wesdillingham.com
LinkedIn <http://www.linkedin.com/in/wesleydillingham>
On Mon, Mar 23, 2020 at 1:32 PM Brian Topping <brian.topping(a)gmail.com>
wrote:
> I liked the first one a lot. Until I read the second one.
>
> > On Mar 23, 2020, at 11:29 AM, Anthony D'Atri <anthony.datri(a)gmail.com>
> wrote:
> >
> > That has potential. Another, albeit suboptimal idea would be simply
> >
> > Quid
> >
> > as in
> >
> > ’S quid
> >
> > as in “it’s squid”. cf. https://en.wikipedia.org/wiki/%27S_Wonderful
> >
> > Alternately just skip to R and when someone asks about Q, we say “The
> first rule of Ceph is that we don’t talk about Q”.
> >
> > — aad
> >
> >>
> >> How about the squid-headed alien species from Star Wars?
> >>
> >>
> https://en.wikipedia.org/wiki/List_of_Star_Wars_species_(P%E2%80%93T)#Quarr…
> >>
> >>
> >>
> >>
> >> On Mon, Mar 23, 2020 at 6:11 PM Sage Weil <sweil(a)redhat.com> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> As we wrap up Octopus and kick off development for Pacific, it seems
> >>> like a good idea to sort out what to call the Q release.
> >>> Traditionally/historically, these have always been names of cephalopod
> >>> species--usually the "common name", but occasionally a latin name
> >>> (infernalis).
> >>>
> >>> Q is a bit of a challenge since there aren't many of either that start
> >>> with Q. Nick Barcet found one: quebecoceras, an extinct genus of
> nautilus
> >>> (https://en.wikipedia.org/wiki/Quebecoceras).
> >>>
> >>> The only other Q cephalopod reference I could find was Squidward Q
> >>> Tentacles, a character (octopus, strangely) from Spongebob Squarepants,
> >>> and Yehuda figured out that the Q stands for Quincy.
> >>>
> >>> So far that's it. If you can find any other options, please catalog
> them
> >>> on the etherpad:
> >>>
> >>> https://pad.ceph.com/p/q
> >>>
> >>> (or even get a head start on future releases.. they're always the
> >>> single-letter pads, e.g., https://pad.ceph.com/p/r).
> >>>
> >>> sage
There's always Quahog here in New England, but I like Quincy.
On Mon, Mar 23, 2020 at 1:13 PM Brian Topping <brian.topping(a)gmail.com>
wrote:
> Maybe just call it Quincy and have a backstory? Might be fun...
>
> > On Mar 23, 2020, at 11:11 AM, Sage Weil <sweil(a)redhat.com> wrote:
> >
> > Hi everyone,
> >
> > As we wrap up Octopus and kick off development for Pacific, it seems
> > like a good idea to sort out what to call the Q release.
> > Traditionally/historically, these have always been names of cephalopod
> > species--usually the "common name", but occasionally a latin name
> > (infernalis).
> >
> > Q is a bit of a challenge since there aren't many of either that start
> > with Q. Nick Barcet found one: quebecoceras, an extinct genus of
> nautilus
> > (https://en.wikipedia.org/wiki/Quebecoceras).
> >
> > The only other Q cephalopod reference I could find was Squidward Q
> > Tentacles, a character (octopus, strangely) from Spongebob Squarepants,
> > and Yehuda figured out that the Q stands for Quincy.
> >
> > So far that's it. If you can find any other options, please catalog
> them
> > on the etherpad:
> >
> > https://pad.ceph.com/p/q
> >
> > (or even get a head start on future releases.. they're always the
> > single-letter pads, e.g., https://pad.ceph.com/p/r).
> >
> > sage