On Fri, 23 Jun 2017, Abhishek L wrote:
> This is the first release candidate for Luminous, the next long term
> stable release.
I just want to reiterate that this is a release candidate, not the final
luminous release. We're still squashing bugs and merging a few last
items. Testing is welcome, but you probably should not deploy this in any
production environments.
Thanks!
sage
> Ceph Luminous will be the foundation for the next long-term
> stable release series. There have been major changes since Kraken
> (v11.2.z) and Jewel (v10.2.z).
>
> Major Changes from Kraken
> -------------------------
>
> - *General*:
>
> * Ceph now has a simple, built-in web-based dashboard for monitoring
> cluster status.
>
> - *RADOS*:
>
> * *BlueStore*:
>
> - The new *BlueStore* backend for *ceph-osd* is now stable and the new
> default for newly created OSDs. BlueStore manages data stored by each OSD
> by directly managing the physical HDDs or SSDs without the use of an
> intervening file system like XFS. This provides greater performance
> and additional features.
> - BlueStore supports *full data and metadata checksums* of all
> data stored by Ceph.
> - BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> also supports zstd for RGW compression but zstd is not recommended for
> BlueStore for performance reasons.)
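>   For example, inline compression could be enabled in ``ceph.conf`` (a
>   minimal sketch using the BlueStore compression options named above;
>   the values shown are illustrative)::
>
>     [osd]
>     bluestore compression algorithm = snappy
>     bluestore compression mode = aggressive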
>
> * *Erasure coded* pools now have full support for *overwrites*,
> allowing them to be used with RBD and CephFS.
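>   For example, overwrites could be enabled on an existing erasure coded
>   pool (the pool name is a placeholder)::
>
>     ceph osd pool set ec-data-pool allow_ec_overwrites true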
>
> * *ceph-mgr*:
>
> - There is a new daemon, *ceph-mgr*, which is a required part of any
> Ceph deployment. Although IO can continue when *ceph-mgr* is
> down, metrics will not refresh and some metrics-related calls
> (e.g., ``ceph df``) may block. We recommend deploying several instances of
> *ceph-mgr* for reliability. See the notes on `Upgrading`_ below.
> - The *ceph-mgr* daemon includes a REST-based management API. The
> API is still experimental and somewhat limited but will form the basis
> for API-based management of Ceph going forward.
>
> * The overall *scalability* of the cluster has improved. We have
> successfully tested clusters with up to 10,000 OSDs.
> * Each OSD can now have a *device class* associated with it (e.g., `hdd` or
> `ssd`), allowing CRUSH rules to trivially map data to a subset of devices
> in the system. Manually writing CRUSH rules or manually editing the
> CRUSH map is normally not required.
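>   For example, a replicated CRUSH rule restricted to one device class
>   could be created and assigned to a pool like this (rule, pool, and
>   class names are illustrative)::
>
>     ceph osd crush rule create-replicated fast-ssd default host ssd
>     ceph osd pool set mypool crush_rule fast-ssd
>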
> * *CRUSH weights* can now be optimized to maintain a *near-perfect
> distribution of data* across OSDs.
> * There is also a new `upmap` exception mechanism that allows
> individual PGs to be moved around to achieve a *perfect
> distribution* (this requires luminous clients).
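>   A hypothetical example, remapping one PG's replica from osd.4 to
>   osd.12 (the PG and OSD ids are illustrative)::
>
>     ceph osd pg-upmap-items 1.7 4 12
>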
> * Each OSD now adjusts its default configuration based on whether the
> backing device is an HDD or SSD. Manual tuning is generally not required.
> * The prototype *mclock QoS queueing algorithm* is now available.
> * There is now a *backoff* mechanism that prevents OSDs from being
> overloaded by requests to objects or PGs that are not currently able to
> process IO.
> * There is a *simplified OSD replacement process* that is more robust.
> * You can query the supported features and (apparent) releases of
> all connected daemons and clients with ``ceph features``.
> * You can configure the oldest Ceph client version you wish to allow to
> connect to the cluster via ``ceph osd set-require-min-compat-client`` and
> Ceph will prevent you from enabling features that will break compatibility
> with those clients.
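>   For example, to review what is connected and then pin the minimum
>   supported client release (the release name is illustrative)::
>
>     ceph features
>     ceph osd set-require-min-compat-client jewel
>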
> * Several `sleep` settings, including ``osd_recovery_sleep``,
> ``osd_snap_trim_sleep``, and ``osd_scrub_sleep`` have been
> reimplemented to work efficiently. (These are used in some cases
> to work around issues by throttling background work.)
>
> - *RGW*:
>
> * RGW *metadata search* backed by ElasticSearch now supports end
> user requests serviced via RGW itself, and also supports custom
> metadata fields. A query language and a set of RESTful APIs were
> created for users to be able to search objects by their
> metadata. New APIs that allow control of custom metadata fields
> were also added.
> * RGW now supports *dynamic bucket index sharding*. As the number
> of objects in a bucket grows, RGW will automatically reshard the
> bucket index in response. No user intervention or bucket size
> capacity planning is required.
> * RGW introduces *server side encryption* of uploaded objects with
> three options for the management of encryption keys: automatic
> encryption (only recommended for test setups), customer provided
> keys similar to the Amazon SSE-C specification, and through the use of
> an external key management service (OpenStack Barbican) similar
> to the Amazon SSE-KMS specification.
> * RGW now has preliminary AWS-like bucket policy API support. For
> now, policy is a means to express a range of new authorization
> concepts. In the future it will be the foundation for additional
> auth capabilities such as STS and group policy.
> * RGW has consolidated its several metadata index pools via the use of
> RADOS namespaces.
>
> - *RBD*:
>
> * RBD now has full, stable support for *erasure coded pools* via the new
> ``--data-pool`` option to ``rbd create``.
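>   For example (pool and image names are placeholders)::
>
>     rbd create --size 1G --data-pool ec-data-pool rbd/myimage
>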
> * RBD mirroring's rbd-mirror daemon is now highly available. We
> recommend deploying several instances of rbd-mirror for
> reliability.
> * The default 'rbd' pool is no longer created automatically during
> cluster creation. Additionally, the name of the default pool used
> by the rbd CLI when no pool is specified can be overridden via a
> new ``rbd default pool = <pool name>`` configuration option.
> * There is initial support for deferred image deletion via the new
> ``rbd trash`` CLI commands. Images, even ones actively in use by
> clones, can be moved to the trash and deleted at a later time.
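>   A sketch of the workflow (pool and image names are placeholders; the
>   image id used by ``rbd trash restore``/``rm`` is printed by
>   ``rbd trash ls``)::
>
>     rbd trash mv rbd/myimage
>     rbd trash ls rbd
>     rbd trash restore <image-id>    # or: rbd trash rm <image-id>
>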
> * New pool-level ``rbd mirror pool promote`` and ``rbd mirror pool
> demote`` commands to batch promote/demote all mirrored images
> within a pool.
> * Mirroring now optionally supports a configurable replication delay
> via the ``rbd mirroring replay delay = <seconds>`` configuration
> option.
> * Improved discard handling when the object map feature is enabled.
> * The rbd CLI ``import`` and ``copy`` commands now detect and
> preserve sparse regions.
> * Snapshots will now include a creation timestamp.
>
> - *CephFS*:
>
> * Support for *multiple active MDS daemons* is now considered stable. The
> number of active MDS servers may be adjusted up or down on an active
> CephFS file system.
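>   For example, to run two active MDS daemons on a filesystem (the
>   filesystem name is a placeholder; depending on the deployment,
>   multiple active daemons may first need to be allowed)::
>
>     ceph fs set cephfs_a allow_multimds true
>     ceph fs set cephfs_a max_mds 2
>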
> * CephFS *directory fragmentation* is now stable and enabled by
> default on new filesystems. To enable it on existing filesystems,
> use ``ceph fs set <fs_name> allow_dirfrags``. Large or very busy
> directories are sharded and (potentially) distributed across
> multiple MDS daemons automatically.
> * Directory subtrees can be explicitly pinned to specific MDS daemons in
> cases where the automatic load balancing is not desired or effective.
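>   For example, a subtree can be pinned to MDS rank 1 by setting an
>   extended attribute on the directory (the mount path is illustrative)::
>
>     setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/projects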
>
> - *Miscellaneous*:
>
> * Release packages are now being built for *Debian Stretch*. The
> distributions we build for now include:
>
> - CentOS 7 (x86_64 and aarch64)
> - Debian 8 Jessie (x86_64)
> - Debian 9 Stretch (x86_64)
> - Ubuntu 16.04 Xenial (x86_64 and aarch64)
> - Ubuntu 14.04 Trusty (x86_64)
>
> Note that QA is limited to CentOS and Ubuntu (xenial and trusty).
>
> * *CLI changes*:
>
> - The ``ceph -s`` or ``ceph status`` command has a fresh look.
> - ``ceph {osd,mds,mon} versions`` summarizes versions of running daemons.
> - ``ceph {osd,mds,mon} count-metadata <property>`` similarly
> tabulates any other daemon metadata visible via the ``ceph
> {osd,mds,mon} metadata`` commands.
> - ``ceph features`` summarizes features and releases of connected
> clients and daemons.
> - ``ceph osd require-osd-release <release>`` replaces the old
> ``require_RELEASE_osds`` flags.
> - ``ceph osd pg-upmap``, ``ceph osd rm-pg-upmap``, ``ceph osd
> pg-upmap-items``, ``ceph osd rm-pg-upmap-items`` can explicitly
> manage `upmap` items.
> - ``ceph osd getcrushmap`` returns a crush map version number on
> stderr, and ``ceph osd setcrushmap [version]`` will only inject
> an updated crush map if the version matches. This allows crush
> maps to be updated offline and then reinjected into the cluster
> without fear of clobbering racing changes (e.g., by newly added
> osds or changes by other administrators).
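>   A sketch of the offline-edit workflow (file names and the version
>   number are illustrative; ``crushtool`` is used to decompile and
>   recompile the map)::
>
>     ceph osd getcrushmap -o crush.bin      # version is printed on stderr
>     crushtool -d crush.bin -o crush.txt
>     # ... edit crush.txt ...
>     crushtool -c crush.txt -o crush.new
>     ceph osd setcrushmap -i crush.new 123  # applied only if version 123 still matches
>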
> - ``ceph osd create`` has been replaced by ``ceph osd new``. This
> should be hidden from most users by user-facing tools like
> `ceph-disk`.
> - ``ceph osd destroy`` will mark an OSD destroyed and remove its
> cephx and lockbox keys. However, the OSD id and CRUSH map entry
> will remain in place, allowing the id to be reused by a
> replacement device with minimal data rebalancing.
> - ``ceph osd purge`` will remove all traces of an OSD from the
> cluster, including its cephx encryption keys, dm-crypt lockbox
> keys, OSD id, and crush map entry.
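>   For example (the OSD id is illustrative; the confirmation flag is
>   assumed here)::
>
>     ceph osd destroy 7 --yes-i-really-mean-it   # keep the id and CRUSH entry for a replacement
>     ceph osd purge 7 --yes-i-really-mean-it     # or remove every trace of the OSD
>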
> - ``ceph osd ls-tree <name>`` will output a list of OSD ids under
> the given CRUSH name (like a host or rack name). This is useful
> for applying changes to entire subtrees. For example, ``ceph
> osd down `ceph osd ls-tree rack1```.
> - ``ceph osd {add,rm}-{noout,noin,nodown,noup}`` allow the
> `noout`, `nodown`, `noin`, and `noup` flags to be applied to
> specific OSDs.
> - ``ceph log last [n]`` will output the last *n* lines of the cluster
> log.
> - ``ceph mgr dump`` will dump the MgrMap, including the currently active
> ceph-mgr daemon and any standbys.
> - ``ceph osd crush swap-bucket <src> <dest>`` will swap the
> contents of two CRUSH buckets in the hierarchy while preserving
> the buckets' ids. This allows an entire subtree of devices to
> be replaced (e.g., to replace an entire host of FileStore OSDs
> with newly-imaged BlueStore OSDs) without disrupting the
> distribution of data across neighboring devices.
> - ``ceph osd set-require-min-compat-client <release>`` configures
> the oldest client release the cluster is required to support.
> Other changes, like CRUSH tunables, will fail with an error if
> they would violate this setting. Changing this setting also
> fails if clients older than the specified release are currently
> connected to the cluster.
> - ``ceph config-key dump`` dumps config-key entries and their
> contents. (The existing ``ceph config-key ls`` only dumps the key
> names, not the values.)
> - ``ceph osd set-{full,nearfull,backfillfull}-ratio`` sets the
> cluster-wide ratio for various full thresholds (when the cluster
> refuses IO, when the cluster warns about being close to full,
> when an OSD will defer rebalancing a PG to itself,
> respectively).
> - ``ceph osd reweightn`` will specify the `reweight` values for
> multiple OSDs in a single command. This is equivalent to a series of
> ``ceph osd reweight`` commands.
> - ``ceph crush class {create,rm,ls}`` manage the new CRUSH *device
> class* feature. ``ceph crush set-device-class <osd> <class>``
> will set the class for a particular device.
> - ``ceph mon feature ls`` will list monitor features recorded in the
> MonMap. ``ceph mon feature set`` will set an optional feature (none of
> these exist yet).
>
> Major Changes from Jewel
> ------------------------
>
> - *RADOS*:
>
> * We now default to the AsyncMessenger (``ms type = async``) instead
> of the legacy SimpleMessenger. The most noticeable difference is
> that we now use a fixed sized thread pool for network connections
> (instead of two threads per socket with SimpleMessenger).
> * Some OSD failures are now detected almost immediately, whereas
> previously the heartbeat timeout (which defaults to 20 seconds)
> had to expire. This prevents IO from blocking for an extended
> period for failures where the host remains up but the ceph-osd
> process is no longer running.
> * The size of encoded OSDMaps has been reduced.
> * The OSDs now quiesce scrubbing when recovery or rebalancing is in progress.
>
> - *RGW*:
>
> * RGW now supports the S3 multipart object copy-part API.
> * It is possible now to reshard an existing bucket offline. Offline
> bucket resharding currently requires that all IO (especially
> writes) to the specific bucket is quiesced. (For automatic online
> resharding, see the new feature in Luminous above.)
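>   A sketch of an offline reshard (bucket name and shard count are
>   placeholders)::
>
>     radosgw-admin bucket reshard --bucket=mybucket --num-shards=64
>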
> * RGW now supports data compression for objects.
> * The Civetweb version has been upgraded to 1.8.
> * The Swift static website API is now supported (S3 support was added
> previously).
> * S3 bucket lifecycle API has been added. Note that currently it only supports
> object expiration.
> * Support for custom search filters has been added to the LDAP auth
> implementation.
> * Support for NFS version 3 has been added to the RGW NFS gateway.
> * A Python binding has been created for librgw.
>
> - *RBD*:
>
> * The rbd-mirror daemon now supports replicating dynamic image
> feature updates and image metadata key/value pairs from the
> primary image to the non-primary image.
> * The number of image snapshots can be optionally restricted to a
> configurable maximum.
> * The rbd Python API now supports asynchronous IO operations.
>
> - *CephFS*:
>
> * libcephfs function definitions have been changed to enable proper
> uid/gid control. The library version has been increased to reflect the
> interface change.
> * Standby replay MDS daemons now consume less memory on workloads
> doing deletions.
> * Scrub now repairs backtraces and populates `damage ls` with
> discovered errors.
> * A new `pg_files` subcommand to `cephfs-data-scan` can identify
> files affected by a damaged or lost RADOS PG.
> * The false-positive "failing to respond to cache pressure" warnings have
> been fixed.
>
> For more details refer to the detailed blog entry at
> http://ceph.com/releases/v12-1-0-luminous-rc-released/
>
> * Git at git://github.com/ceph/ceph.git
> * Tarball at http://download.ceph.com/tarballs/ceph-12.1.0.tar.gz
> * For packages, see http://docs.ceph.com/docs/master/install/get-packages/
> * For ceph-deploy, see http://docs.ceph.com/docs/master/install/install-ceph-deploy
> * Release sha1: 262617c9f16c55e863693258061c5b25dea5b086
>
> --
> Abhishek Lekshmanan
> SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)