On Thu, Apr 22, 2021 at 9:24 PM Cem Zafer <cemzafer(a)gmail.com> wrote:
>
> Sorry to disturb you again, but changing the value to yes doesn't affect anything. Executing a simple ceph command from the client returns the following error again. I'm not so sure it is related to that parameter.
> Have you any idea what could cause the problem?
>
> indiana@mars:~$ ceph -s
> 2021-04-22T22:19:51.305+0300 7f20ea249700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
> [errno 13] RADOS permission denied (error connecting to the cluster)
This looks like a different host/client. What version of ceph-common
is installed on mars (or just run "ceph -v" instead of "ceph -s")?
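On a Debian-based host, something like this should show it (just
a sketch; adjust for your package manager):

dpkg -s ceph-common | grep Version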
Thanks,
Ilya
Hello.
I have an RGW S3 user, and the user has 2 buckets.
I tried to copy objects from old.bucket to new.bucket with rclone (on
the RGW client server).
Afterwards I checked the objects with "radosgw-admin --bucket=new.bucket
object stat $i" and saw the old.bucket id and marker id, as well as the
old bucket name, in the object stats.
Is RGW doing this for deduplication, or is it a bug?
If it's not a bug, what will happen to these objects if I delete the
old bucket?
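In case it helps, this is roughly how I checked every object (a sketch;
bucket names are placeholders, and the jq filter is my assumption about
the "bucket list" JSON output):

for obj in $(radosgw-admin bucket list --bucket=new.bucket | jq -r '.[].name'); do
    # look for references to the old bucket in each object's stat output
    radosgw-admin object stat --bucket=new.bucket --object="$obj" | grep -i old.bucket
done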
Hi,
I have a customer VM that is running fine, but I cannot create
snapshots anymore.
rbd snap create rbd/IMAGE@test-bb-1
just hangs forever.
When I checked the status with
rbd status rbd/IMAGE
it shows one watcher: the CPU node where the VM is running.
What can I do to investigate further without restarting the VM?
This is the only affected VM, and it stopped working three days ago.
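For what it's worth, I was thinking of inspecting the watcher on the
image header object directly, roughly like this (a sketch; IMAGE is a
placeholder, and <id> is the image id reported by "rbd info"):

rbd info rbd/IMAGE                          # note the "id" field
rados -p rbd listwatchers rbd_header.<id>   # show who watches the header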
On Thu, Apr 22, 2021 at 6:01 PM Cem Zafer <cemzafer(a)gmail.com> wrote:
>
> Thanks Ilya, pointing me out to the right direction.
> So if I change auth_allow_insecure_global_id_reclaim to true, does that mean older userspace clients are allowed to connect to the cluster?
Yes, but upgrading all clients and setting it to false is recommended.
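Roughly this sequence (a sketch):

# temporarily re-allow old clients
ceph config set mon auth_allow_insecure_global_id_reclaim true
# ... upgrade all clients to 14.2.20 / 15.2.11 / 16.2.1 or later ...
# then disallow insecure reclaim again
ceph config set mon auth_allow_insecure_global_id_reclaim false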
Thanks,
Ilya
On Thu, Apr 22, 2021 at 5:04 PM Cem Zafer <cemzafer(a)gmail.com> wrote:
>
> Hi Ilya,
> Yes, you are correct: I have set auth_allow_insecure_global_id_reclaim to false.
> The host's ceph-common package version is 16.2.0, and the cluster's "ceph -v" output is as follows.
>
> root@ceph100:~# ceph -v
> ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific (stable)
> Regards.
Right, so because you set auth_allow_insecure_global_id_reclaim to false,
older userspace clients, in this case 16.2.0, are not allowed to connect
because they won't reclaim their global_id in a secure fashion. See
https://docs.ceph.com/en/latest/security/CVE-2021-20288/
for details.
Thanks,
Ilya
>
> On Thu, Apr 22, 2021 at 4:49 PM Ilya Dryomov <idryomov(a)gmail.com> wrote:
>>
>> On Thu, Apr 22, 2021 at 3:24 PM Cem Zafer <cemzafer(a)gmail.com> wrote:
>> >
>> > Hi,
>> > I have recently added a new host to ceph and copied the /etc/ceph directory to
>> > the new host. When I execute a simple ceph command such as "ceph -s", I get the
>> > following error.
>> >
>> > 2021-04-22T14:50:46.226+0300 7ff541141700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > 2021-04-22T14:50:46.226+0300 7ff540940700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > 2021-04-22T14:50:46.226+0300 7ff533fff700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > [errno 13] RADOS permission denied (error connecting to the cluster)
>> >
>> > When I looked at the syslog on the ceph cluster node, I saw the
>> > following messages too.
>> >
>> > Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+0000
>> > 7fe4d28cb700 0 cephx server client.admin: attempt to reclaim global_id
>> > 264198 without presenting ticket
>> > Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+0000
>> > 7fe4d28cb700 0 cephx server client.admin: could not verify old ticket
>> >
>> > Can anyone help me out or point me in the right direction, or to a link?
>>
>> Hi Cem,
>>
>> I take it that you upgraded to one of the 14.2.20, 15.2.11 or 16.2.1
>> releases and then set auth_allow_insecure_global_id_reclaim to false?
>>
>> What version of ceph-common package is installed on that host? What is
>> the output of "ceph -v"?
>>
>> Thanks,
>>
>> Ilya
Hi,
I have recently added a new host to ceph and copied the /etc/ceph directory to
the new host. When I execute a simple ceph command such as "ceph -s", I get the
following error.
2021-04-22T14:50:46.226+0300 7ff541141700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [2]
2021-04-22T14:50:46.226+0300 7ff540940700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [2]
2021-04-22T14:50:46.226+0300 7ff533fff700 -1 monclient(hunting):
handle_auth_bad_method server allowed_methods [2] but i only support [2]
[errno 13] RADOS permission denied (error connecting to the cluster)
When I looked at the syslog on the ceph cluster node, I saw the
following messages too.
Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+0000
7fe4d28cb700 0 cephx server client.admin: attempt to reclaim global_id
264198 without presenting ticket
Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+0000
7fe4d28cb700 0 cephx server client.admin: could not verify old ticket
Can anyone help me out or point me in the right direction, or to a link?
Regards.
Hi,
I upgraded a test cluster to 15.2.11, and after the upgrade finished
the cluster status was HEALTH_WARN because of:
AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure
global_id reclaim
According to the documentation at
https://docs.ceph.com/en/octopus/rados/operations/health-checks/#auth-insec…
I ran
ceph config set mon auth_allow_insecure_global_id_reclaim false
because no clients were reported to be using insecure global_ids.
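(I checked roughly like this, a sketch:

ceph health detail | grep AUTH_INSECURE_GLOBAL_ID_RECLAIM

which showed no per-client warnings.)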
Now "ceph -s" cannot connect to the cluster any more:
# ceph -s
[errno 13] RADOS permission denied (error connecting to the cluster)
What should I do?
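(I assume I could revert with

ceph config set mon auth_allow_insecure_global_id_reclaim true

run from a host whose client can still authenticate, e.g. one of the
mon hosts, but I would like to understand the proper fix first.)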
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Mandatory information per §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered office: Berlin
Hi Igor,
After updating to 14.2.19 and then moving some PGs around, we have a
few warnings related to the new efficient PG removal code, e.g. [1].
Is this something to worry about?
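For reference, we spotted these with a simple grep over the OSD logs,
roughly:

grep '_delete_some additional unexpected onode' /var/log/ceph/ceph-osd.*.log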
Best Regards,
Dan
[1]
/var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700 0
osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
(33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[0#10:4d280000::::head#]
/var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700 0
osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
(33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[5#10:fcc80000::::head#]
/var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700 0
osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
(33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
_delete_some additional unexpected onode list (new onodes has appeared
since PG removal started[3#10:17400000::::head#]
/var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700 0
osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
(35835'9908201,40804'9911289] lb MIN (bitwise)
local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
mbc={}] _delete_some additional unexpected onode list (new onodes has
appeared since PG removal started[#7:2b000000::::head#]
Hey all,
I wanted to confirm my understanding of some of the mechanics of
backfill in EC pools. I've yet to find a document that outlines this
in detail; if there is one, please send it my way. :) Some of what I
write below is likely in the "well, duh" category, but I tended
towards completeness.
First off, I understand that backfill reservations work the same way
between replicated pools and EC pools. A local reservation is taken on
the primary OSD, then a remote reservation on the backfill target(s),
before the backfill is allowed to begin. Until this point, the
backfill is in the backfill_wait state.
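(This is visible in the PG's recovery state, e.g.:

ceph pg 1.1 query | jq '.recovery_state'

where a PG waiting for reservations sits in states like
WaitLocalBackfillReserved / WaitRemoteBackfillReserved.)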
When the backfill begins, though, is when the differences begin. Let's
say we have an EC 3+2 (k=3, m=2) PG that's backfilling from OSD 2 to
OSD 5 (formatted here like pgs_brief):
1.1 active+remapped+backfilling [0,1,5,3,4] 0 [0,1,2,3,4] 0
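(Such lines can be pulled with something like:

ceph pg dump pgs_brief | grep backfilling

though the exact columns vary a bit by release.)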
The question in my mind was: Where is the data for this backfill
coming from? In replicated pools, all reads come from the primary.
However, in this case, the primary does not have the data in question;
the primary has to either read the EC chunk from OSD 2, or it has to
reconstruct it by reading from 3 of the OSDs in the acting set.
Based on observation, I _think_ this is what happens:
1. As long as the PG is not degraded, the backfill read is simply
forwarded by the primary to OSD 2.
2. Once the PG becomes degraded, the backfill read needs to use the
reconstructing path, and begins reading from 3 of the OSDs in the
acting set.
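(My observations come mostly from watching in-flight ops on the source
OSDs via the admin socket, run on the host of the OSD, e.g.:

ceph daemon osd.2 dump_ops_in_flight

which is admittedly an indirect way to see where backfill reads land.)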
Questions:
1. Can anyone confirm or correct my description of how EC backfill
operates? In particular, in case 2 above, does it matter whether OSD 2
is the cause of degradation, for example? Does the read still get
forwarded to a single OSD when it's parity chunks that are being moved
via backfill?
2. I'm curious as to why a 3rd reservation, for the source OSD, wasn't
introduced as a part of EC in Ceph. We've occasionally seen an OSD
become overloaded because several backfills were reading from it
simultaneously, and there's no way to control this via the normal
osd_max_backfills mechanism. Is anyone aware of discussions to this
effect?
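(By the osd_max_backfills mechanism I mean the usual throttle, e.g.:

ceph config set osd osd_max_backfills 1

which, as described above, only reserves on the primary and the
backfill targets, not on the read source.)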
Thanks!
Josh