I had to temporarily disconnect the network on my entire Ceph cluster, so I
prepared the cluster by following what appears to be some incomplete
advice.
I did the following before disconnecting the network:
#ceph osd set noout
#ceph osd set norecover
#ceph osd set norebalance
#ceph osd set nobackfill
#ceph osd set nodown
#ceph osd set pause
Now, all the ceph services are still running, but I cannot undo any flags:
root@proxmox01:~# ceph osd unset pause
2024-02-22T13:16:02.220+0000 7f0aab5a26c0 0 monclient(hunting):
authenticate timed out after 300
[errno 110] RADOS timed out (error connecting to the cluster)
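For reference, this is the sequence I expect to run once the monitors respond again; I can check them locally via the admin socket, since the normal client path times out (assuming the mon id matches the hostname):
#ceph daemon mon.$(hostname -s) mon_status
Once quorum is back, clear the flags in roughly the reverse order:
#ceph osd unset pause
#ceph osd unset nodown
#ceph osd unset nobackfill
#ceph osd unset norebalance
#ceph osd unset norecover
#ceph osd unset noout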
Any advice on how to recover would be greatly appreciated.
Thank you,
-Chip
Update: we have run fsck and reshard on all BlueStore volumes; it seems sharding had not been applied.
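For reference, the per-OSD pass looked roughly like this on a non-containerized deployment (the OSD id and sharding string below are illustrative, not our exact values):
systemctl stop ceph-osd@0
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0
ceph-bluestore-tool show-sharding --path /var/lib/ceph/osd/ceph-0
ceph-bluestore-tool reshard --path /var/lib/ceph/osd/ceph-0 --sharding "m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P"
systemctl start ceph-osd@0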
Unfortunately scrubs and deep-scrubs are still stuck on the PGs of the pool suffering the issue, while other PGs scrub fine.
The next step will be to remove the cache tier as suggested, but that is not possible yet, as the PGs need to be scrubbed before the tier agent can activate.
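For reference, the plan once the PGs scrub is the workaround from the thread plus the documented removal, roughly as below; the base pool name vms is an assumption taken from the commands earlier in the thread:
ceph osd pool set vms_cache hit_set_count 0
ceph osd tier cache-mode vms_cache proxy
rados -p vms_cache cache-flush-evict-all
ceph osd tier remove-overlay vms
ceph osd tier remove vms vms_cache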
As we are struggling to make this cluster work again, any help would be greatly appreciated.
Cédric
> On 20 Feb 2024, at 20:22, Cedric <yipikai7(a)gmail.com> wrote:
>
> Thanks Eugen, sorry about the missed reply to all.
>
> The reason we still have the cache tier is that we were not able to flush all dirty entries to remove it (as per the procedure); the cluster was migrated from HDD/SSD to NVMe a while ago, but the tiering remains, unfortunately.
>
> So actually we are trying to understand the root cause
>
> On Tue, Feb 20, 2024 at 1:43 PM Eugen Block <eblock(a)nde.ag> wrote:
>>
>> Please don't drop the list from your response.
>>
>> The first question that comes to mind is: why do you have a cache tier if
>> all your pools are on NVMe devices anyway? I don't see any benefit here.
>> Did you try the suggested workaround and disable the cache-tier?
>>
>> Zitat von Cedric <yipikai7(a)gmail.com>:
>>
>>> Thanks Eugen, see attached infos.
>>>
>>> Some more details:
>>>
>>> - commands that actually hang: ceph balancer status ; rbd -p vms ls ;
>>> rados -p vms_cache cache-flush-evict-all
>>> - all scrubs running on vms_cache PGs stall / restart in a loop
>>> without actually doing anything
>>> - all I/O is 0, both in ceph status and in iostat on the nodes
>>>
>>> On Tue, Feb 20, 2024 at 10:00 AM Eugen Block <eblock(a)nde.ag> wrote:
>>>>
>>>> Hi,
>>>>
>>>> some more details would be helpful, for example what's the pool size
>>>> of the cache pool? Did you issue a PG split before or during the
>>>> upgrade? This thread [1] deals with the same problem; the described
>>>> workaround was to set hit_set_count to 0 and disable the cache layer
>>>> until that is resolved. Afterwards you could enable the cache layer
>>>> again. But keep in mind that the code for cache tiering is entirely
>>>> removed in Reef (IIRC).
>>>>
>>>> Regards,
>>>> Eugen
>>>>
>>>> [1]
>>>> https://ceph-users.ceph.narkive.com/zChyOq5D/ceph-strange-issue-after-addin…
>>>>
>>>> Zitat von Cedric <yipikai7(a)gmail.com>:
>>>>
>>>>> Hello,
>>>>>
>>>>> Following an upgrade from Nautilus (14.2.22) to Pacific (16.2.13), we
>>>>> encountered an issue with a cache pool becoming completely stuck,
>>>>> relevant messages below:
>>>>>
>>>>> pg xx.x has invalid (post-split) stats; must scrub before tier agent
>>>>> can activate
>>>>>
>>>>> In the OSD logs, scrubs keep starting in a loop without ever succeeding
>>>>> for all PGs of this pool.
>>>>>
>>>>> What we already tried without luck so far:
>>>>>
>>>>> - shutdown / restart OSD
>>>>> - rebalance pg between OSD
>>>>> - raise the memory on OSD
>>>>> - repeer PG
>>>>>
>>>>> Any idea what is causing this? Any help would be greatly appreciated.
>>>>>
>>>>> Thanks
>>>>>
>>>>> Cédric
Hi Folks,
We are excited to announce plans for building a larger Ceph-S3 setup.
To ensure its success, extensive testing is needed in advance.
Some of these tests don't need a full-blown Ceph cluster on hardware
but still require meeting specific logical requirements, such as a
multi-site S3 setup. To address this, we're pleased to introduce our
ceph-s3-box test environment, which you can access on GitHub:
https://github.com/hetznercloud/ceph-s3-box
In the spirit of collaboration and knowledge sharing, we've made this
testing environment publicly available today. We hope that it proves
as beneficial to you as it has been for us.
If you have any questions or suggestions, please don't hesitate to reach out.
Cheers,
Ansgar
Good morning,
I am trying to understand Ceph snapshot sizing. For example, if I have a 2.7
GB volume and I create a snap on it, the sizing says:
(BEFORE SNAP)
rbd du volumes/volume-d954915c-1dc1-41cb-8bf0-0c67e7b6e080
NAME PROVISIONED USED
volume-d954915c-1dc1-41cb-8bf0-0c67e7b6e080 10 GiB 2.7 GiB
(AFTER SNAP)
rbd du volumes/volume-d954915c-1dc1-41cb-8bf0-0c67e7b6e080
NAME PROVISIONED USED
volume-d954915c-1dc1-41cb-8bf0-0c67e7b6e080@snap01 10 GiB 2.7 GiB
volume-d954915c-1dc1-41cb-8bf0-0c67e7b6e080 10 GiB 0 B
<TOTAL> 10 GiB 2.7 GiB
Why is the snap 2.7 GB? Shouldn't it be 0 GB in the beginning, and only grow
once COW starts doing its thing (copying the original blocks to the snap
before they are overwritten with new ones)?
Am I wrong?
Thank you.
Hi folks,
I am trying to set up a new Ceph S3 multisite setup, and it looks to me
like DNS-style S3 is broken in multisite: when rgw_dns_name is
configured, the `radosgw-admin period update --commit` from the newly
added member will not succeed!
It looks like whenever hostnames are configured, it breaks on the newly
added cluster:
https://docs.ceph.com/en/reef/radosgw/multisite/#setting-a-zonegroup
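For reference, the hostname change that seems to trigger it is done roughly as in the linked docs; the zonegroup name "default" below is just a placeholder:
radosgw-admin zonegroup get --rgw-zonegroup=default > zonegroup.json
# edit the "hostnames" list in zonegroup.json, e.g. "hostnames": ["s3.example.com"]
radosgw-admin zonegroup set --rgw-zonegroup=default < zonegroup.json
radosgw-admin period update --commit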
Thanks for any advice!
Ansgar
Hello,
I deployed RGW and NFSGW services over a Ceph (version 17.2.6) cluster. Both services are accessed through 2 (separate) ingresses, which work as expected when contacted by clients.
However, I'm experiencing a problem when running both ingresses on the same cluster.
The keepalived logs are full of "(VI_0) received an invalid passwd!" lines, because both ingresses use the same virtual router id, so I'm trying to introduce an additional parameter in the service definition manifests to work around the problem (first_virtual_router_id, default value is 50). Below are the manifest contents:
service_type: ingress
service_id: ingress.rgw
service_name: ingress.rgw
placement:
  hosts:
    - c00.domain.org
    - c01.domain.org
    - c02.domain.org
spec:
  backend_service: rgw.rgw
  frontend_port: 8080
  monitor_port: 1967
  virtual_ips_list:
    - X.X.X.200/24
  first_virtual_router_id: 60
service_type: ingress
service_id: nfs.nfsgw
service_name: ingress.nfs.nfsgw
placement:
  count: 2
spec:
  backend_service: nfs.nfsgw
  frontend_port: 2049
  monitor_port: 9049
  virtual_ip: X.X.X.222/24
  first_virtual_router_id: 70
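For completeness, I apply the manifests with the orchestrator roughly like this (the file names are just placeholders):
ceph orch apply -i ingress-rgw.yaml
ceph orch apply -i ingress-nfs.yaml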
When I apply the manifests I get the following error, for both ingress definitions:
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 'first_virtual_router_id'
even though the documentation for the Quincy version describes the option and includes a similar example at: https://docs.ceph.com/en/quincy/cephadm/services/rgw
Both manifests are working smoothly if I remove the first_virtual_router_id line.
Any ideas on how I can troubleshoot the issue?
Thanks in advance
Ramon
--
Ramon Orrù
Servizio di Calcolo
Laboratori Nazionali di Frascati
Istituto Nazionale di Fisica Nucleare
Via E. Fermi, 54 - 00044 Frascati (RM) Italy
Tel. +39 06 9403 2345
Estimate on release timeline for 17.2.8?
- after pacific 16.2.15 and reef 18.2.2 hotfix
(https://tracker.ceph.com/issues/64339,
https://tracker.ceph.com/issues/64406)
Estimate on release timeline for 19.2.0?
- target April, depending on testing and RCs
- Testing plan for Squid beyond dev freeze (regression and upgrade
tests, performance tests, RCs)
Can we fix old.ceph.com?
- continued discussion about the need to revive the pg calc tool
T release name?
- please add and vote for suggestions in https://pad.ceph.com/p/t
- need name before we can open "t kickoff" pr
I have logged this as https://tracker.ceph.com/issues/64213
On 16/01/2024 14:18, DERUMIER, Alexandre wrote:
> Hi,
>
>>> ImportError: PyO3 modules may only be initialized once per
>>> interpreter
>>> process
>>>
>>> and ceph -s reports "Module 'dashboard' has failed dependency: PyO3
>>> modules may only be initialized once per interpreter process
> We have the same problem on Proxmox 8 (based on Debian 12) with Ceph
> Quincy or Reef.
>
> It seems to be related to the Python version on Debian 12.
>
> (We have no fix for this currently.)
>
>
>
Hi everyone,
You are invited to join us at the User + Dev meeting this week Thursday,
February 22 at 10:00 AM Eastern Time!
Focus Topic: CephFS Snapshots Evaluation
Presented by: Enrico Bocchi and Abhishek Lekshmanan, Ceph operators from
CERN
From the presenters:
Ceph at CERN provides block, object, and file storage backing the IT
infrastructure of the Organization. CephFS, in particular, is largely used
through the integration with OpenStack Manila by container-based workloads
(Kubernetes, OpenShift), HPC MPI clusters, and as a general-purpose
networked file system for enterprise groupware and open infrastructure
technologies (code/software repositories, monitoring, analytics, etc.).
Our presentation focuses on CephFS snapshots and their implications on
performance and stability. Snapshots would be a valuable addition to our
existing CephFS service, as they allow for storage rollback and disaster
recovery through mirroring. According to our observations, however, they
introduce a non-negligible performance penalty and may jeopardize the
stability of the file system.
In particular, we would like to discuss:
1. Experiences with CephFS snapshots from other operators in the Ceph
community.
2. Tools and strategies one can deploy to pre-empt or mitigate issues.
3. How to effectively contribute with upstream developers and interested
community users to address the identified limitations.
Feel free to add questions or additional topics under the "Open Discussion"
section on the agenda: https://pad.ceph.com/p/ceph-user-dev-monthly-minutes
If you have an idea for a focus topic you'd like to present at a future
meeting, you are welcome to submit it to this Google Form:
https://docs.google.com/forms/d/e/1FAIpQLSdboBhxVoBZoaHm8xSmeBoemuXoV_rmh4v…
Any Ceph user or developer is eligible to submit!
Thanks,
Neha