This is the eighth backport release in the Ceph Mimic stable release
series. Its sole purpose is to fix a regression that found its way into
the previous release.
Notable Changes
---------------
* Due to a missed backport, clusters in the process of being upgraded from
13.2.6 to 13.2.7 might suffer an OSD crash in build_incremental_map_msg.
This regression was reported in https://tracker.ceph.com/issues/43106
and is fixed in 13.2.8 (this release). Users of 13.2.6 can upgrade to 13.2.8
directly - i.e., skip 13.2.7 - to avoid this.
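  To check which release the daemons are actually running (for example,
  part-way through an upgrade), the following can be used:

    # summarizes the ceph version of the running mon/mgr/osd/mds daemons
    ceph versions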
Changelog
---------
* osd: fix sending incremental map messages (issue#43106 pr#32000, Sage Weil)
* tests: added missing point release versions (pr#32087, Yuri Weinstein)
* tests: rgw: add missing force-branch: ceph-mimic for swift tasks (pr#32033, Casey Bodley)
For a blog post with links to the relevant PRs and issues, please see
https://ceph.io/releases/v13-2-8-mimic-released/
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-13.2.8.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
Hi,
Trying to create a new OSD following the instructions available at
https://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
On step 3 I'm instructed to run "ceph-osd -i {osd-num} --mkfs
--mkkey". Unfortunately it doesn't work:
# ceph-osd -i 3 --mkfs --mkkey
2019-12-11 16:59:58.257 7fac4597fc00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-3/keyring: (2) No such file or directory
2019-12-11 16:59:58.257 7fac4597fc00 -1 AuthRegistry(0x55ad976ea140)
no keyring found at /var/lib/ceph/osd/ceph-3/keyring, disabling cephx
2019-12-11 16:59:58.261 7fac4597fc00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-3/keyring: (2) No such file or directory
2019-12-11 16:59:58.261 7fac4597fc00 -1 AuthRegistry(0x7fffac4075e8)
no keyring found at /var/lib/ceph/osd/ceph-3/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)
Shouldn't it create the keyring? Why is it complaining about not being
able to find a keyring?
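The error itself mentions a --no-mon-config flag; just guessing here, but is
the mkfs step supposed to be run like this, or should a keyring already exist
at that path?

# ceph-osd -i 3 --mkfs --mkkey --no-mon-config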
Regards,
Rodrigo
Hi Cephers,
To better understand how our current users utilize Ceph, we are conducting a
public community survey. This information will guide the community in
deciding how to spend contribution efforts on future development. The survey
results will remain anonymous and will be aggregated in future Ceph Foundation
publications to the community.
I'm pleased to announce that, after much discussion on the Ceph dev mailing
list [0], the community has put together the Ceph Survey for 2019.
Because the survey went out later than we'd like, the deadline will
be January 31st, 2020 at 11:59 PT.
https://ceph.io/user-survey/
We have discussed using the Ceph telemetry module to collect this data in
the future, to save time for our users. Please let me know of any mistakes
that need to be corrected in the survey. Thanks!
[0] -
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/WU374ZJP5N3NKY22X2…
--
Mike Perez
he/him
Ceph Community Manager
M: +1-951-572-2633
494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee <https://twitter.com/thingee>
Thingee <https://www.linkedin.com/thingee>
Philippe,
I have a master branch version of the code to test. The nautilus
backport https://github.com/ceph/ceph/pull/31956 should be the same.
Using your OSDMap, the code in the master branch, and some additional changes
to osdmaptool, I was able to balance your cluster. The osdmaptool
changes simulate the mgr's active balancer behavior. It never took more
than 0.13991 seconds to calculate the next set of upmaps per round, and that's
on a virtual machine used for development. It took 35 rounds with a
maximum of 10 upmaps per crush-rule set of pools per round. With the default
1-minute sleep inside the mgr it would take 35 minutes. Obviously,
recovery/backfill has to finish before the cluster settles into the new
configuration. It needed 397 additional upmaps and removed 8.
Because all pools for a given crush rule are balanced together, you can
see that this is more balanced than Rich's configuration using Luminous.
This balancer code is subject to change before the next Nautilus point
release is finalized.
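For anyone who wants to reproduce this offline, the run was roughly along
these lines (a sketch; the exact osdmaptool options may still change in the
backport):

# grab the current OSDMap from the cluster
ceph osd getmap -o osdmap.bin
# compute at most 10 new upmap entries and write the corresponding
# "ceph osd pg-upmap-items ..." commands to out.txt
osdmaptool osdmap.bin --upmap out.txt --upmap-max 10
# review out.txt, then apply it (the mgr balancer does the same work online)
source out.txt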
Final layout:
osd.0 pgs 146
osd.1 pgs 146
osd.2 pgs 146
osd.3 pgs 146
osd.4 pgs 146
osd.5 pgs 146
osd.6 pgs 146
osd.7 pgs 146
osd.8 pgs 146
osd.9 pgs 146
osd.10 pgs 146
osd.11 pgs 146
osd.12 pgs 74
osd.13 pgs 74
osd.14 pgs 73
osd.15 pgs 74
osd.16 pgs 74
osd.17 pgs 74
osd.18 pgs 73
osd.19 pgs 74
osd.20 pgs 73
osd.21 pgs 73
osd.22 pgs 74
osd.23 pgs 73
osd.24 pgs 73
osd.25 pgs 75
osd.26 pgs 74
osd.27 pgs 74
osd.28 pgs 73
osd.29 pgs 73
osd.30 pgs 73
osd.31 pgs 73
osd.32 pgs 74
osd.33 pgs 73
osd.34 pgs 73
osd.35 pgs 74
osd.36 pgs 74
osd.37 pgs 74
osd.38 pgs 74
osd.39 pgs 74
osd.40 pgs 73
osd.41 pgs 73
osd.42 pgs 73
osd.43 pgs 73
osd.44 pgs 74
osd.45 pgs 73
osd.46 pgs 73
osd.47 pgs 73
osd.48 pgs 73
osd.49 pgs 73
osd.50 pgs 73
osd.51 pgs 73
osd.52 pgs 75
osd.53 pgs 59
osd.54 pgs 74
osd.55 pgs 74
osd.56 pgs 74
osd.57 pgs 73
osd.58 pgs 74
osd.59 pgs 74
osd.60 pgs 74
osd.61 pgs 74
osd.62 pgs 73
osd.63 pgs 74
osd.64 pgs 73
osd.65 pgs 74
osd.66 pgs 74
osd.67 pgs 74
osd.68 pgs 73
osd.69 pgs 74
osd.70 pgs 73
osd.71 pgs 73
osd.72 pgs 73
osd.73 pgs 73
osd.74 pgs 73
osd.75 pgs 73
osd.76 pgs 73
osd.77 pgs 73
osd.78 pgs 73
osd.79 pgs 73
osd.80 pgs 73
osd.81 pgs 73
osd.82 pgs 73
osd.83 pgs 73
osd.84 pgs 73
osd.85 pgs 73
osd.86 pgs 73
osd.87 pgs 73
osd.88 pgs 73
osd.89 pgs 73
osd.90 pgs 73
osd.91 pgs 73
osd.92 pgs 73
osd.93 pgs 73
osd.94 pgs 73
osd.95 pgs 73
osd.96 pgs 73
osd.97 pgs 73
osd.98 pgs 73
osd.99 pgs 73
osd.100 pgs 146
osd.101 pgs 146
osd.102 pgs 146
osd.103 pgs 146
osd.104 pgs 146
osd.105 pgs 146
osd.106 pgs 146
osd.107 pgs 146
osd.108 pgs 146
osd.109 pgs 146
osd.110 pgs 146
osd.111 pgs 146
osd.112 pgs 73
osd.113 pgs 73
osd.114 pgs 73
osd.115 pgs 73
osd.116 pgs 73
osd.117 pgs 73
osd.118 pgs 73
osd.119 pgs 73
osd.120 pgs 73
osd.121 pgs 73
osd.122 pgs 73
osd.123 pgs 73
osd.124 pgs 73
osd.125 pgs 73
osd.126 pgs 73
osd.127 pgs 74
osd.128 pgs 73
osd.129 pgs 73
osd.130 pgs 73
osd.131 pgs 73
osd.132 pgs 73
osd.133 pgs 73
osd.134 pgs 73
osd.135 pgs 73
David
On 12/10/19 9:59 PM, Philippe D'Anjou wrote:
> Given I was told it's an issue of too few PGs, I am raising and testing
> this, although my SSDs, which have about 150 PGs each, are also not well
> distributed.
> I attached my OSDMap; I'd appreciate it if you could run your test on it
> like you did with the other guy, so I know whether this will ever
> distribute equally or not.
>
> If you're busy I understand that too, then ignore this.
>
> Thanks in either case. I have just been dealing with this for months
> now and it is getting frustrating.
>
> best regards
>
> On Tuesday, December 10, 2019, at 03:53:17 OEZ, David Zafman
> <dzafman(a)redhat.com> wrote:
>
>
>
> Please file a tracker with the symptom and examples. Please attach your
> OSDMap (ceph osd getmap > osdmap.bin).
>
> Note that https://github.com/ceph/ceph/pull/31956 has the Nautilus
> version of improved upmap code. It also changes osdmaptool to match the
> mgr behavior, so that one can observe the behavior of the upmap balancer
> offline.
>
> Thanks
>
> David
>
> On 12/8/19 11:04 AM, Philippe D'Anjou wrote:
> > It's only getting worse after raising PGs now.
> >
> > Anything between:
> > 96 hdd 9.09470 1.00000 9.1 TiB 4.9 TiB 4.9 TiB 97 KiB 13 GiB 4.2 TiB 53.62 0.76 54 up
> >
> > and
> >
> > 89 hdd 9.09470 1.00000 9.1 TiB 8.1 TiB 8.1 TiB 88 KiB 21 GiB 1001 GiB 89.25 1.27 87 up
> >
> > How is that possible? I don't know how much more proof I need to
> > present that there's a bug.
>
> >
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users(a)lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
Hi!
We have a Ceph cluster with 42 OSDs in production, serving mainly users'
home directories. Ceph is 14.2.4 Nautilus.
We have 3 pools: an images pool (for RBD images), a cephfs_metadata pool,
and a cephfs_data pool.
Our raw data is about 5.6T. All pools have replica size 3, there are only
very few snapshots in the RBD images pool, and the CephFS pools don't use
snapshots.
How is it possible that the status tells us that 21T of 46T is used, when
that's much more than 3 times the raw size?
Also, to make it more confusing, at least half of the cluster is free, yet
we get a pg in backfill_toofull after adding some OSDs lately.
The Ceph dashboard tells us the pool is 82% full and has only 4.5T free.
The autoscale module seems to count the 20T times 3 for the space needed
and thus shows wrong numbers (see below).
Status of the cluster is added below too.
How can these size/capacity numbers be explained?
And would there be a recommendation to change something?
Thank you in advance!
best
Jochen
# ceph -s
cluster:
id: 2b16167f-3f33-4580-a0e9-7a71978f403d
health: HEALTH_ERR
Degraded data redundancy (low space): 1 pg backfill_toofull
1 subtrees have overcommitted pool target_size_bytes
1 subtrees have overcommitted pool target_size_ratio
2 pools have too many placement groups
services:
mon: 4 daemons, quorum jade,assam,matcha,jasmine (age 2d)
mgr: earl(active, since 24h), standbys: assam
mds: cephfs:1 {0=assam=up:active} 1 up:standby
osd: 42 osds: 42 up (since 106m), 42 in (since 115m); 30 remapped pgs
data:
pools: 3 pools, 2048 pgs
objects: 29.80M objects, 5.6 TiB
usage: 21 TiB used, 25 TiB / 46 TiB avail
pgs: 1164396/89411013 objects misplaced (1.302%)
2018 active+clean
22 active+remapped+backfill_wait
7 active+remapped+backfilling
1 active+remapped+backfill_wait+backfill_toofull
io:
client: 1.7 KiB/s rd, 516 KiB/s wr, 0 op/s rd, 28 op/s wr
recovery: 9.2 MiB/s, 41 objects/s
# ceph osd pool autoscale-status
POOL             SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
images           354.2G               3.0   46100G        0.0231                1.0   1024    32          warn
cephfs_metadata  13260M               3.0   595.7G        0.0652                1.0   512     8           warn
cephfs_data      20802G               3.0   46100G        1.3537                1.0   512                 warn
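In case it helps with the diagnosis, I can also pull the more detailed usage
views, along the lines of:

# per-pool STORED vs. USED, i.e. data before and after replication overhead
ceph df detail
# per-OSD fill level; backfill_toofull relates to individual OSDs getting too
# full rather than to the cluster-wide average
ceph osd df tree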
I have a strange problem with CephFS and extended attributes. I have two CentOS machines where I mount CephFS in exactly the same way (I manually executed the exact same mount command on both machines). On one of the machines, getfattr returns this:
[root@ceph-01 ~]# getfattr -d -m 'ceph.*' /mnt/cephfs/hpc/home
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/hpc/home
ceph.dir.entries="49"
ceph.dir.files="1"
ceph.dir.rbytes="77816237666910"
ceph.dir.rctime="1575978038.0976848840"
ceph.dir.rentries="6673312"
ceph.dir.rfiles="6271408"
ceph.dir.rsubdirs="401904"
ceph.dir.subdirs="48"
and on the other I get nothing:
[root@gnosis ~]# getfattr -d -m 'ceph.*' /mnt/cephfs/hpc/home
No error message, just nothing.
The only difference is that ceph-01 was kickstarted with CentOS 7.6 while gnosis was kickstarted with CentOS 7.7. Otherwise, both machines are deployed identically. getfattr is the same version on both. Kernel versions are ceph-01: 5.0.2-1.el7.elrepo.x86_64 and gnosis: 5.4.2-1.el7.elrepo.x86_64.
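One thing I can still try (just a guess that only the listing is affected, not
the attributes themselves): querying a single attribute by name instead of
enumerating them, e.g.

[root@gnosis ~]# getfattr -n ceph.dir.rbytes /mnt/cephfs/hpc/home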
Does anyone have a pointer what to look for?
Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
Shouldn't Ceph's documentation be presented "per version"?
I believe there might be per-version documentation for Ceph, but I
can't see on the Ceph documentation site how to easily find each
version's docs.
Regards,
Rodrigo Severo
Hi everyone,
The next Cephalocon is coming up on March 3-5 in Seoul! The CFP is open
until Friday (get your talks in!). We expect to have the program
ready for the first week of January. Registration (early bird) will be
available soon.
We're also looking for sponsors for the conference. The prospectus is
available here:
https://ceph.io/wp-content/uploads/2019/12/sponsor-Cephalocon20-112719.pdf
Thanks!
sage
Hi,
If I change the storage class of an object via s3cmd, the object's
storage class is reported as being changed. However, when inspecting
where the objects are placed (via `rados -p <pool> ls`, see further on),
the object seems to be retained in the original pool.
The idea behind this test setup is to simulate two storage locations,
one based on SSDs or similar flash storage, the other on slow HDDs. We
want to be able to alter the storage location of objects on the fly,
typically only from fast to slow storage. The object should then only
reside on slow storage.
The setup is as follows on Nautilus (Ubuntu 16.04, see
<https://gist.github.com/mrngm/bba6ffdc545bfa52ebf79d6d2c002a6d> for the
full dump):
<<<<<<<<
root@node1:~# ceph -s
health: HEALTH_OK
mon: 3 daemons, quorum node1,node3,node5 (age 12d)
mgr: node2(active, since 6d), standbys: node4
osd: 4 osds: 4 up (since 12d), 4 in (since 12d)
rgw: 1 daemon active (node1)
pools: 7 pools, 296 pgs
objects: 229 objects, 192 KiB
usage: 3.2 GiB used, 6.8 GiB / 10 GiB avail
pgs: 296 active+clean
root@node1:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.00970 root default
-16 0.00970 datacenter nijmegen
-3 0.00388 host node2
0 hdd 0.00388 osd.0 up 1.00000 1.00000
-5 0.00388 host node3
1 hdd 0.00388 osd.1 up 1.00000 1.00000
-7 0.00098 host node4
2 ssd 0.00098 osd.2 up 1.00000 1.00000
-9 0.00098 host node5
3 ssd 0.00098 osd.3 up 1.00000 1.00000
root@node1:~# ceph osd pool ls detail
pool 1 'tier1-ssd' replicated size 2 min_size 1 crush_rule 1 object_hash
rjenkins pg_num 128 pgp_num 128 [snip] application rgw
pool 2 'tier2-hdd' replicated size 1 min_size 1 crush_rule 2 object_hash
rjenkins pg_num 128 pgp_num 128 [snip] application rgw
pool 3 '.rgw.root' replicated size 2 min_size 1 crush_rule 0 object_hash
rjenkins pg_num 8 pgp_num 8 [snip] application rgw
pool 4 'default.rgw.control' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 5 'default.rgw.meta' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 6 'default.rgw.log' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2
crush_rule 0 [snip] application rgw
root@node1:~# ceph osd pool application get # compacted
tier1-ssd => rgw {}
tier2-hdd => rgw {}
.rgw.root => rgw {}
default.rgw.control => rgw {}
default.rgw.meta => rgw {}
default.rgw.log => rgw {}
default.rgw.buckets.index => rgw {}
root@node1:~# radosgw-admin zonegroup placement list
[
{
"key": "default-placement",
"val": {
"name": "default-placement",
"tags": [],
"storage_classes": [
"SPINNING_RUST",
"STANDARD"
]
}
}
]
root@node1:~# radosgw-admin zone placement list
[
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"storage_classes": {
"SPINNING_RUST": {
"data_pool": "tier2-hdd"
},
"STANDARD": {
"data_pool": "tier1-ssd"
}
},
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0
}
}
]
========
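For context, the SPINNING_RUST storage class was defined roughly along these
lines (a sketch from memory, so the exact invocation may differ slightly):

root@node1:~# radosgw-admin zonegroup placement add --rgw-zonegroup default \
    --placement-id default-placement --storage-class SPINNING_RUST
root@node1:~# radosgw-admin zone placement add --rgw-zone default \
    --placement-id default-placement --storage-class SPINNING_RUST \
    --data-pool tier2-hdd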
I can also post the relevant s3cmd commands for putting objects and
setting the storage class, but perhaps this is already enough
information. Please let me know.
<<<<<<<<
root@node1:~# rados -p tier1-ssd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_darthvader.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_2019-10-15-090436_1254x522_scrubbed.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_kanariepiet.jpg
root@node1:~# rados -p tier2-hdd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.FEruUOZaVJXJcOG-e2tO1xcInNzoEvN_0
$ s3cmd info s3://bucket/kanariepiet.jpg
[snip]
Last mod: Tue, 10 Dec 2019 08:09:58 GMT
Storage: STANDARD
[snip]
$ s3cmd info s3://bucket/darthvader.png
[snip]
Last mod: Wed, 04 Dec 2019 10:35:14 GMT
Storage: SPINNING_RUST
[snip]
$ s3cmd info s3://bucket/2019-10-15-090436_1254x522_scrubbed.png
[snip]
Last mod: Tue, 10 Dec 2019 10:33:24 GMT
Storage: STANDARD
[snip]
==========
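If useful, I can also dump how RGW itself records the placement of one of
these objects, e.g. (sketch):

root@node1:~# radosgw-admin object stat --bucket=bucket --object=darthvader.png
(the manifest in the output should show which storage class / data pool RGW
considers current)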
Any thoughts on what might be going on here?
Best regards,
Gerdriaan Mulder