This is the eighth backport release in the Ceph Mimic stable release
series. Its sole purpose is to fix a regression that found its way into
the previous release.
Notable Changes
---------------
* Due to a missed backport, clusters in the process of being upgraded from
13.2.6 to 13.2.7 might suffer an OSD crash in build_incremental_map_msg.
This regression was reported in https://tracker.ceph.com/issues/43106
and is fixed in 13.2.8 (this release). Users of 13.2.6 can upgrade to 13.2.8
directly - i.e., skip 13.2.7 - to avoid this.
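  To check which release the daemons are actually running (for example,
  part-way through an upgrade), the following can be used:

    # summarizes the ceph version of the running mon/mgr/osd/mds daemons
    ceph versions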
Changelog
---------
* osd: fix sending incremental map messages (issue#43106 pr#32000, Sage Weil)
* tests: added missing point release versions (pr#32087, Yuri Weinstein)
* tests: rgw: add missing force-branch: ceph-mimic for swift tasks (pr#32033, Casey Bodley)
For a blog post with links to the relevant PRs and issues, please see
https://ceph.io/releases/v13-2-8-mimic-released/
Getting Ceph
------------
* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-13.2.8.tar.gz
* For packages, see http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
Hi,
Trying to create a new OSD following the instructions available at
https://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/
On step 3 I'm instructed to run "ceph-osd -i {osd-num} --mkfs
--mkkey". Unfortunately it doesn't work:
# ceph-osd -i 3 --mkfs --mkkey
2019-12-11 16:59:58.257 7fac4597fc00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-3/keyring: (2) No such file or directory
2019-12-11 16:59:58.257 7fac4597fc00 -1 AuthRegistry(0x55ad976ea140)
no keyring found at /var/lib/ceph/osd/ceph-3/keyring, disabling cephx
2019-12-11 16:59:58.261 7fac4597fc00 -1 auth: unable to find a keyring
on /var/lib/ceph/osd/ceph-3/keyring: (2) No such file or directory
2019-12-11 16:59:58.261 7fac4597fc00 -1 AuthRegistry(0x7fffac4075e8)
no keyring found at /var/lib/ceph/osd/ceph-3/keyring, disabling cephx
failed to fetch mon config (--no-mon-config to skip)
Shouldn't it create the keyring? Why is it complaining about not being
able to find a keyring?
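The error itself mentions a --no-mon-config flag; just guessing here, but is
the mkfs step supposed to be run like this, or should a keyring already exist
at that path?

# ceph-osd -i 3 --mkfs --mkkey --no-mon-config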
Regards,
Rodrigo
Hi Cephers,
To better understand how our current users utilize Ceph, we are conducting a
public community survey. This information will guide the community in
deciding how to spend contribution efforts on future development. The survey
results will remain anonymous and will be aggregated in future Ceph Foundation
publications to the community.
I'm pleased to announce that, after much discussion on the Ceph dev mailing
list [0], the community has put together the Ceph Survey for 2019.
Because the survey went out later than we'd like, the deadline will
be January 31st, 2020 at 11:59 PT.
https://ceph.io/user-survey/
We have discussed using the Ceph telemetry module to collect this data in
the future, to save time for our users. Please let me know of any mistakes
that need to be corrected in the survey. Thanks!
[0] -
https://lists.ceph.io/hyperkitty/list/dev@ceph.io/thread/WU374ZJP5N3NKY22X2…
--
Mike Perez
he/him
Ceph Community Manager
M: +1-951-572-2633
494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee <https://twitter.com/thingee>
Thingee <https://www.linkedin.com/thingee>
Philippe,
I have a master branch version of the code to test. The nautilus
backport https://github.com/ceph/ceph/pull/31956 should be the same.
Using your OSDMap, the code in the master branch, and some additional changes
to osdmaptool, I was able to balance your cluster. The osdmaptool
changes simulate the mgr's active balancer behavior. It never took more
than 0.13991 seconds to calculate the next set of upmaps per round, and that's
on a virtual machine used for development. It took 35 rounds with a
maximum of 10 upmaps per crush-rule set of pools per round. With the default
1-minute sleep inside the mgr it would take 35 minutes. Obviously,
recovery/backfill has to finish before the cluster settles into the new
configuration. It needed 397 additional upmaps and removed 8.
Because all pools for a given crush rule are balanced together, you can
see that this is more balanced than Rich's configuration using Luminous.
This balancer code is subject to change before the next Nautilus point
release is finalized.
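For anyone who wants to reproduce this offline, the run was roughly along
these lines (a sketch; the exact osdmaptool options may still change in the
backport):

# grab the current OSDMap from the cluster
ceph osd getmap -o osdmap.bin
# compute at most 10 new upmap entries and write the corresponding
# "ceph osd pg-upmap-items ..." commands to out.txt
osdmaptool osdmap.bin --upmap out.txt --upmap-max 10
# review out.txt, then apply it (the mgr balancer does the same work online)
source out.txt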
Final layout:
osd.0 pgs 146
osd.1 pgs 146
osd.2 pgs 146
osd.3 pgs 146
osd.4 pgs 146
osd.5 pgs 146
osd.6 pgs 146
osd.7 pgs 146
osd.8 pgs 146
osd.9 pgs 146
osd.10 pgs 146
osd.11 pgs 146
osd.12 pgs 74
osd.13 pgs 74
osd.14 pgs 73
osd.15 pgs 74
osd.16 pgs 74
osd.17 pgs 74
osd.18 pgs 73
osd.19 pgs 74
osd.20 pgs 73
osd.21 pgs 73
osd.22 pgs 74
osd.23 pgs 73
osd.24 pgs 73
osd.25 pgs 75
osd.26 pgs 74
osd.27 pgs 74
osd.28 pgs 73
osd.29 pgs 73
osd.30 pgs 73
osd.31 pgs 73
osd.32 pgs 74
osd.33 pgs 73
osd.34 pgs 73
osd.35 pgs 74
osd.36 pgs 74
osd.37 pgs 74
osd.38 pgs 74
osd.39 pgs 74
osd.40 pgs 73
osd.41 pgs 73
osd.42 pgs 73
osd.43 pgs 73
osd.44 pgs 74
osd.45 pgs 73
osd.46 pgs 73
osd.47 pgs 73
osd.48 pgs 73
osd.49 pgs 73
osd.50 pgs 73
osd.51 pgs 73
osd.52 pgs 75
osd.53 pgs 59
osd.54 pgs 74
osd.55 pgs 74
osd.56 pgs 74
osd.57 pgs 73
osd.58 pgs 74
osd.59 pgs 74
osd.60 pgs 74
osd.61 pgs 74
osd.62 pgs 73
osd.63 pgs 74
osd.64 pgs 73
osd.65 pgs 74
osd.66 pgs 74
osd.67 pgs 74
osd.68 pgs 73
osd.69 pgs 74
osd.70 pgs 73
osd.71 pgs 73
osd.72 pgs 73
osd.73 pgs 73
osd.74 pgs 73
osd.75 pgs 73
osd.76 pgs 73
osd.77 pgs 73
osd.78 pgs 73
osd.79 pgs 73
osd.80 pgs 73
osd.81 pgs 73
osd.82 pgs 73
osd.83 pgs 73
osd.84 pgs 73
osd.85 pgs 73
osd.86 pgs 73
osd.87 pgs 73
osd.88 pgs 73
osd.89 pgs 73
osd.90 pgs 73
osd.91 pgs 73
osd.92 pgs 73
osd.93 pgs 73
osd.94 pgs 73
osd.95 pgs 73
osd.96 pgs 73
osd.97 pgs 73
osd.98 pgs 73
osd.99 pgs 73
osd.100 pgs 146
osd.101 pgs 146
osd.102 pgs 146
osd.103 pgs 146
osd.104 pgs 146
osd.105 pgs 146
osd.106 pgs 146
osd.107 pgs 146
osd.108 pgs 146
osd.109 pgs 146
osd.110 pgs 146
osd.111 pgs 146
osd.112 pgs 73
osd.113 pgs 73
osd.114 pgs 73
osd.115 pgs 73
osd.116 pgs 73
osd.117 pgs 73
osd.118 pgs 73
osd.119 pgs 73
osd.120 pgs 73
osd.121 pgs 73
osd.122 pgs 73
osd.123 pgs 73
osd.124 pgs 73
osd.125 pgs 73
osd.126 pgs 73
osd.127 pgs 74
osd.128 pgs 73
osd.129 pgs 73
osd.130 pgs 73
osd.131 pgs 73
osd.132 pgs 73
osd.133 pgs 73
osd.134 pgs 73
osd.135 pgs 73
David
On 12/10/19 9:59 PM, Philippe D'Anjou wrote:
> Given I was told it's an issue of too few PGs, I am raising and testing
> this, although my SSDs, which have about 150 PGs each, are also not well
> distributed.
> I attached my OSDMap; I'd appreciate it if you could run your test on it
> like you did with the other guy, so I know whether this will ever
> distribute equally or not.
>
> If you're busy I understand that too, then ignore this.
>
> Thanks in either case. I have just been dealing with this for months
> now and it is getting frustrating.
>
> best regards
>
> On Tuesday, December 10, 2019, at 03:53:17 OEZ, David Zafman
> <dzafman(a)redhat.com> wrote:
>
>
>
> Please file a tracker with the symptom and examples. Please attach your
> OSDMap (ceph osd getmap > osdmap.bin).
>
> Note that https://github.com/ceph/ceph/pull/31956 has the Nautilus
> version of improved upmap code. It also changes osdmaptool to match the
> mgr behavior, so that one can observe the behavior of the upmap balancer
> offline.
>
> Thanks
>
> David
>
> On 12/8/19 11:04 AM, Philippe D'Anjou wrote:
> > It's only getting worse after raising PGs now.
> >
> > Anything between:
> > 96 hdd 9.09470 1.00000 9.1 TiB 4.9 TiB 4.9 TiB 97 KiB 13 GiB 4.2 TiB 53.62 0.76 54 up
> >
> > and
> >
> > 89 hdd 9.09470 1.00000 9.1 TiB 8.1 TiB 8.1 TiB 88 KiB 21 GiB 1001 GiB 89.25 1.27 87 up
> >
> > How is that possible? I don't know how much more proof I need to
> > present that there's a bug.
>
> >
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users(a)lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
Hi!
We have a Ceph cluster with 42 OSDs in production, serving mainly users'
home directories. Ceph is 14.2.4 Nautilus.
We have 3 pools: an images pool (for RBD images), a cephfs_metadata pool,
and a cephfs_data pool.
Our raw data is about 5.6T. All pools have replica size 3, there are only
very few snapshots in the RBD images pool, and the CephFS pools don't use
snapshots.
How is it possible that the status tells us that 21T of 46T is used, when
that's much more than 3 times the raw size?
Also, to make it more confusing, at least half of the cluster is free, yet
we get a pg in backfill_toofull after adding some OSDs lately.
The Ceph dashboard tells us the pool is 82% full and has only 4.5T free.
The autoscale module seems to count the 20T times 3 for the space needed
and thus shows wrong numbers (see below).
Status of the cluster is added below too.
How can these size/capacity numbers be explained?
And would there be a recommendation to change something?
Thank you in advance!
best
Jochen
# ceph -s
cluster:
id: 2b16167f-3f33-4580-a0e9-7a71978f403d
health: HEALTH_ERR
Degraded data redundancy (low space): 1 pg backfill_toofull
1 subtrees have overcommitted pool target_size_bytes
1 subtrees have overcommitted pool target_size_ratio
2 pools have too many placement groups
services:
mon: 4 daemons, quorum jade,assam,matcha,jasmine (age 2d)
mgr: earl(active, since 24h), standbys: assam
mds: cephfs:1 {0=assam=up:active} 1 up:standby
osd: 42 osds: 42 up (since 106m), 42 in (since 115m); 30 remapped pgs
data:
pools: 3 pools, 2048 pgs
objects: 29.80M objects, 5.6 TiB
usage: 21 TiB used, 25 TiB / 46 TiB avail
pgs: 1164396/89411013 objects misplaced (1.302%)
2018 active+clean
22 active+remapped+backfill_wait
7 active+remapped+backfilling
1 active+remapped+backfill_wait+backfill_toofull
io:
client: 1.7 KiB/s rd, 516 KiB/s wr, 0 op/s rd, 28 op/s wr
recovery: 9.2 MiB/s, 41 objects/s
# ceph osd pool autoscale-status
POOL             SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
images           354.2G               3.0   46100G        0.0231                1.0   1024    32          warn
cephfs_metadata  13260M               3.0   595.7G        0.0652                1.0   512     8           warn
cephfs_data      20802G               3.0   46100G        1.3537                1.0   512                 warn
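In case it helps with the diagnosis, I can also pull the more detailed usage
views, along the lines of:

# per-pool STORED vs. USED, i.e. data before and after replication overhead
ceph df detail
# per-OSD fill level; backfill_toofull relates to individual OSDs getting too
# full rather than to the cluster-wide average
ceph osd df tree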
I have a strange problem with CephFS and extended attributes. I have two CentOS machines where I mount CephFS in exactly the same way (I manually executed the exact same mount command on both machines). On one of the machines, getfattr returns this:
[root@ceph-01 ~]# getfattr -d -m 'ceph.*' /mnt/cephfs/hpc/home
getfattr: Removing leading '/' from absolute path names
# file: mnt/cephfs/hpc/home
ceph.dir.entries="49"
ceph.dir.files="1"
ceph.dir.rbytes="77816237666910"
ceph.dir.rctime="1575978038.0976848840"
ceph.dir.rentries="6673312"
ceph.dir.rfiles="6271408"
ceph.dir.rsubdirs="401904"
ceph.dir.subdirs="48"
and on the other I get nothing:
[root@gnosis ~]# getfattr -d -m 'ceph.*' /mnt/cephfs/hpc/home
No error message, just nothing.
The only difference is that ceph-01 was kickstarted with CentOS 7.6 while gnosis was kickstarted with CentOS 7.7. Otherwise, both machines are deployed identically. getfattr is the same version on both. Kernel versions are ceph-01: 5.0.2-1.el7.elrepo.x86_64 and gnosis: 5.4.2-1.el7.elrepo.x86_64.
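One thing I can still try (just a guess that only the listing is affected, not
the attributes themselves): querying a single attribute by name instead of
enumerating them, e.g.

[root@gnosis ~]# getfattr -n ceph.dir.rbytes /mnt/cephfs/hpc/home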
Does anyone have a pointer what to look for?
Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
Shouldn't Ceph's documentation be presented "per version"?
I believe there might be per-version documentation for Ceph, but I
can't see on the Ceph documentation site how to easily find each
version's docs.
Regards,
Rodrigo Severo
Hi everyone,
The next Cephalocon is coming up on March 3-5 in Seoul! The CFP is open
until Friday (get your talks in!). We expect to have the program
ready for the first week of January. Registration (early bird) will be
available soon.
We're also looking for sponsors for the conference. The prospectus is
available here:
https://ceph.io/wp-content/uploads/2019/12/sponsor-Cephalocon20-112719.pdf
Thanks!
sage
Hi,
If I change the storage class of an object via s3cmd, the object's
storage class is reported as being changed. However, when inspecting
where the objects are placed (via `rados -p <pool> ls`, see further on),
the object seems to be retained in the original pool.
The idea behind this test setup is to simulate two storage locations,
one based on SSDs or similar flash storage, the other on slow HDDs. We
want to be able to alter the storage location of objects on the fly,
typically only from fast to slow storage. The object should then only
reside on slow storage.
The setup is as follows on Nautilus (Ubuntu 16.04, see
<https://gist.github.com/mrngm/bba6ffdc545bfa52ebf79d6d2c002a6d> for the
full dump):
<<<<<<<<
root@node1:~# ceph -s
health: HEALTH_OK
mon: 3 daemons, quorum node1,node3,node5 (age 12d)
mgr: node2(active, since 6d), standbys: node4
osd: 4 osds: 4 up (since 12d), 4 in (since 12d)
rgw: 1 daemon active (node1)
pools: 7 pools, 296 pgs
objects: 229 objects, 192 KiB
usage: 3.2 GiB used, 6.8 GiB / 10 GiB avail
pgs: 296 active+clean
root@node1:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0.00970 root default
-16 0.00970 datacenter nijmegen
-3 0.00388 host node2
0 hdd 0.00388 osd.0 up 1.00000 1.00000
-5 0.00388 host node3
1 hdd 0.00388 osd.1 up 1.00000 1.00000
-7 0.00098 host node4
2 ssd 0.00098 osd.2 up 1.00000 1.00000
-9 0.00098 host node5
3 ssd 0.00098 osd.3 up 1.00000 1.00000
root@node1:~# ceph osd pool ls detail
pool 1 'tier1-ssd' replicated size 2 min_size 1 crush_rule 1 object_hash
rjenkins pg_num 128 pgp_num 128 [snip] application rgw
pool 2 'tier2-hdd' replicated size 1 min_size 1 crush_rule 2 object_hash
rjenkins pg_num 128 pgp_num 128 [snip] application rgw
pool 3 '.rgw.root' replicated size 2 min_size 1 crush_rule 0 object_hash
rjenkins pg_num 8 pgp_num 8 [snip] application rgw
pool 4 'default.rgw.control' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 5 'default.rgw.meta' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 6 'default.rgw.log' replicated size 2 min_size 1 crush_rule 0
[snip] application rgw
pool 7 'default.rgw.buckets.index' replicated size 3 min_size 2
crush_rule 0 [snip] application rgw
root@node1:~# ceph osd pool application get # compacted
tier1-ssd => rgw {}
tier2-hdd => rgw {}
.rgw.root => rgw {}
default.rgw.control => rgw {}
default.rgw.meta => rgw {}
default.rgw.log => rgw {}
default.rgw.buckets.index => rgw {}
root@node1:~# radosgw-admin zonegroup placement list
[
{
"key": "default-placement",
"val": {
"name": "default-placement",
"tags": [],
"storage_classes": [
"SPINNING_RUST",
"STANDARD"
]
}
}
]
root@node1:~# radosgw-admin zone placement list
[
{
"key": "default-placement",
"val": {
"index_pool": "default.rgw.buckets.index",
"storage_classes": {
"SPINNING_RUST": {
"data_pool": "tier2-hdd"
},
"STANDARD": {
"data_pool": "tier1-ssd"
}
},
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_type": 0
}
}
]
========
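For context, the SPINNING_RUST storage class was defined roughly along these
lines (a sketch from memory, so the exact invocation may differ slightly):

root@node1:~# radosgw-admin zonegroup placement add --rgw-zonegroup default \
    --placement-id default-placement --storage-class SPINNING_RUST
root@node1:~# radosgw-admin zone placement add --rgw-zone default \
    --placement-id default-placement --storage-class SPINNING_RUST \
    --data-pool tier2-hdd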
I can also post the relevant s3cmd commands for putting objects and
setting the storage class, but perhaps this is already enough
information. Please let me know.
<<<<<<<<
root@node1:~# rados -p tier1-ssd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_darthvader.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_2019-10-15-090436_1254x522_scrubbed.png
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1_kanariepiet.jpg
root@node1:~# rados -p tier2-hdd ls
ce2fc9ee-edc8-4dc7-a3fe-b1458c67168b.5805.1__shadow_.FEruUOZaVJXJcOG-e2tO1xcInNzoEvN_0
$ s3cmd info s3://bucket/kanariepiet.jpg
[snip]
Last mod: Tue, 10 Dec 2019 08:09:58 GMT
Storage: STANDARD
[snip]
$ s3cmd info s3://bucket/darthvader.png
[snip]
Last mod: Wed, 04 Dec 2019 10:35:14 GMT
Storage: SPINNING_RUST
[snip]
$ s3cmd info s3://bucket/2019-10-15-090436_1254x522_scrubbed.png
[snip]
Last mod: Tue, 10 Dec 2019 10:33:24 GMT
Storage: STANDARD
[snip]
==========
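If useful, I can also dump how RGW itself records the placement of one of
these objects, e.g. (sketch):

root@node1:~# radosgw-admin object stat --bucket=bucket --object=darthvader.png
(the manifest in the output should show which storage class / data pool RGW
considers current)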
Any thoughts on what might be going on here?
Best regards,
Gerdriaan Mulder