Yes, this is all set up. It was working fine until the problem occurred
with the OSD host that lost the cluster/sync network.
There are a few other VMs that keep running fine without this
issue. I've restarted the problematic VM without success (that is,
creating a file works, but overwriting it still hangs right away). fsck
runs fine, so reading the whole image works.
I'm kind of stumped as to what can cause this.
Because of the lengthy recovery, and the PG autoscaler currently doing
things, there are lots of PGs that haven't been scrubbed, but I
doubt that is an issue here.
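In case it helps to narrow things down, the checks I plan to run against the image directly from an admin node look roughly like this (the pool/image names below are placeholders, not our real ones):

# look for stale watchers left over from the half-down period
rbd status vmpool/vm-100-disk-0

# confirm which EC data pool the image actually points at
rbd info vmpool/vm-100-disk-0

# try overwrites outside the VM entirely
rbd bench --io-type write --io-pattern rand --io-total 64M vmpool/vm-100-disk-0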
On 2023-09-29 at 18:52, Anthony D'Atri wrote:
> EC for RBD wasn't possible until Luminous IIRC, so I had to ask. You have a replicated metadata pool defined? Does proxmox know that this is an EC pool? When connecting it needs to know both the metadata and data pools.
>
>> On Sep 29, 2023, at 12:49, peter.linder(a)fiberdirekt.se wrote:
>>
>> (sorry for duplicate emails)
>>
>> This turns out to be a good question actually.
>>
>> The cluster is running Quincy, 17.2.6.
>>
>> The compute node that is running the VM is Proxmox, version 7.4-3. Supposedly this is fairly new, but the version of librbd1 claims to be 14.2.21 when I check with "apt list". We are not using Proxmox's own Ceph release. We haven't had any issues with this setup before, but until now we hadn't been using erasure-coded pools, nor had we had a node half-dead for such a long time.
>>
>> The VM is configured through Proxmox, which is not libvirt but similar, and krbd is not enabled. I don't know for sure whether Proxmox links its own librbd into qemu/kvm.
>>
>> "ceph features" looks like this:
>>
>> {
>>     "mon": [
>>         {
>>             "features": "0x3f01cfbf7ffdffff",
>>             "release": "luminous",
>>             "num": 5
>>         }
>>     ],
>>     "osd": [
>>         {
>>             "features": "0x3f01cfbf7ffdffff",
>>             "release": "luminous",
>>             "num": 24
>>         }
>>     ],
>>     "client": [
>>         {
>>             "features": "0x3f01cfb87fecffff",
>>             "release": "luminous",
>>             "num": 4
>>         },
>>         {
>>             "features": "0x3f01cfbf7ffdffff",
>>             "release": "luminous",
>>             "num": 12
>>         }
>>     ],
>>     "mgr": [
>>         {
>>             "features": "0x3f01cfbf7ffdffff",
>>             "release": "luminous",
>>             "num": 2
>>         }
>>     ]
>> }
>>
>> Regards,
>>
>> Peter
>>
>>
>> On 2023-09-29 at 17:55, Anthony D'Atri wrote:
>>> Which Ceph releases are installed on the VM and the back end? Is the VM using librbd through libvirt, or krbd?
>>>
>>>> On Sep 29, 2023, at 09:09, Peter Linder <peter.linder(a)fiberdirekt.se> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I have a problem: after an OSD host lost connection to the rear sync/cluster network for many hours (the public network stayed online), a test VM using RBD can't overwrite its files. I can create a new file inside it just fine, but not overwrite it; the process just hangs.
>>>>
>>>> The VM's disk is on an erasure-coded data pool with a replicated pool in front of it. EC overwrites are enabled for the pool.
>>>>
>>>> The cluster consists of 5 hosts with 4 OSDs each, and separate hosts for compute. There are separate public and cluster networks. In this case, the AOC cable to the cluster network went link-down on a host; it had to be replaced and the host rebooted. Recovery took about a week to complete. The host was half-down like this for about 12 hours.
>>>>
>>>> I have some other VMs as well with images in the same pool (4 in total), and they seem to work fine; it is just this one that can't overwrite.
>>>>
>>>> I'm thinking there is somehow something wrong with just this image?
>>>>
>>>> Regards,
>>>>
>>>> Peter
Did you apply the changes to the containers.conf file on all hosts?
The MGR daemon issues the cephadm commands on the remote hosts, so it
would need that as well. That setup has worked quite well for me for
years now. What distro are your hosts running? We mostly use openSUSE
or SLES, but I have also run this successfully on Ubuntu VMs the
same way.
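For reference, this is roughly what it looks like here (the proxy address and no_proxy entries below are placeholders, adjust to your environment), plus a quick way to confirm podman itself can get through:

# /etc/containers/containers.conf, identical on every host cephadm manages
[engine]
env = ["http_proxy=http://proxy.example.com:3128",
       "https_proxy=http://proxy.example.com:3128",
       "no_proxy=localhost,127.0.0.1,ceph-node1,ceph-node2"]

# sanity check directly on the host
podman pull quay.io/ceph/ceph:v17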
Quoting Majid Varzideh <m.varzideh(a)gmail.com>:
> Thanks for your responses.
> Locally it is OK and I can pull images without any issue. I also set the proxy
> parameters in containers.conf, but with no result.
> It seems that the proxy environment variables are not known to the user issuing
> the pull commands.
>
>
> On Wed, Sep 27, 2023 at 11:00 AM Eugen Block <eblock(a)nde.ag> wrote:
>
>> You'll need a containers.conf file:
>>
>> # cat /etc/containers/containers.conf
>> [engine]
>> env = ["http_proxy=<IP>:<PORT>", "https_proxy=<IP>:<PORT>",
>> "no_proxy=localhost"]
>>
>> Restarting the container should apply the change. Make sure you also
>> have the correct no_proxy settings, for example so the ceph servers
>> don't communicate via proxy with each other.
>>
>> Quoting Majid Varzideh <m.varzideh(a)gmail.com>:
>>
>> > Hi friends,
>> > I have deployed my first node in the cluster. We don't have direct internet on
>> > my server, so I have to set a proxy for that. I set it in /etc/environment and
>> > /etc/profile, but I get the error below:
>> > 2023-09-26 17:09:38,254 7f04058b4b80 DEBUG
>> > --------------------------------------------------------------------------------
>> > cephadm ['--image', 'quay.io/ceph/ceph:v17', 'pull']
>> > 2023-09-26 17:09:38,302 7f04058b4b80 INFO Pulling container image quay.io/ceph/ceph:v17...
>> > 2023-09-26 17:09:42,083 7f04058b4b80 INFO Non-zero exit code 125 from /usr/bin/podman pull quay.io/ceph/ceph:v17
>> > 2023-09-26 17:09:42,083 7f04058b4b80 INFO /usr/bin/podman: stderr Trying to pull quay.io/ceph/ceph:v17...
>> > 2023-09-26 17:09:42,084 7f04058b4b80 INFO /usr/bin/podman: stderr time="2023-09-26T17:09:38+03:30" level=warning msg="Failed, retrying in 1s ... (1/3). Error: initializing source docker://quay.io/ceph/ceph:v17: pinging container registry quay.io: Get \"https://quay.io/v2/\": dial tcp 34.228.154.221:443: connect: connection refused"
>> > 2023-09-26 17:09:42,084 7f04058b4b80 INFO /usr/bin/podman: stderr time="2023-09-26T17:09:39+03:30" level=warning msg="Failed, retrying in 1s ... (2/3). Error: initializing source docker://quay.io/ceph/ceph:v17: pinging container registry quay.io: Get \"https://quay.io/v2/\": dial tcp 3.220.246.53:443: connect: connection refused"
>> > 2023-09-26 17:09:42,084 7f04058b4b80 INFO /usr/bin/podman: stderr time="2023-09-26T17:09:40+03:30" level=warning msg="Failed, retrying in 1s ... (3/3). Error: initializing source docker://quay.io/ceph/ceph:v17: pinging container registry quay.io: Get \"https://quay.io/v2/\": dial tcp 18.213.60.205:443: connect: connection refused"
>> > 2023-09-26 17:09:42,084 7f04058b4b80 INFO /usr/bin/podman: stderr Error: initializing source docker://quay.io/ceph/ceph:v17: pinging container registry quay.io: Get "https://quay.io/v2/": dial tcp 34.231.182.47:443: connect: connection refused
>> > 2023-09-26 17:09:42,084 7f04058b4b80 ERROR ERROR: Failed command: /usr/bin/podman pull quay.io/ceph/ceph:v17
>> >
>> >
>> > Would you please help?
>> > Thanks,
Hi,
See below for the details of the warnings.
The cluster is running 17.2.5, and the warnings have been around for a while.
One concern of mine is num_segments growing over time. The number of clients
warned about in MDS_CLIENT_OLDEST_TID has increased from 18 to 25 as well.
The nodes run kernel 4.19.0-91.82.42.uelc20.x86_64.
It looks like a bug in the client library. Would rebooting the affected nodes
only fix it for a short period of time? Any suggestions from the community
for fixing this?
Thanks,
Ben
[root@8cd2c0657c77 /]# ceph health detail
HEALTH_WARN 6 hosts fail cephadm check; 2 clients failing to respond to
capability release; 25 clients failing to advance oldest client/flush tid;
3 MDSs report slow requests; 3 MDSs behind on trimming
[WRN] CEPHADM_HOST_CHECK_FAILED: 6 hosts fail cephadm check
host host15w (192.168.31.33) failed check: Unable to reach remote host
host15w. Process exited with non-zero exit status 1
host host20w (192.168.31.38) failed check: Unable to reach remote host
host20w. Process exited with non-zero exit status 1
host host19w (192.168.31.37) failed check: Unable to reach remote host
host19w. Process exited with non-zero exit status 1
host host17w (192.168.31.35) failed check: Unable to reach remote host
host17w. Process exited with non-zero exit status 1
host host18w (192.168.31.36) failed check: Unable to reach remote host
host18w. Process exited with non-zero exit status 1
host host16w (192.168.31.34) failed check: Unable to reach remote host
host16w. Process exited with non-zero exit status 1
[WRN] MDS_CLIENT_LATE_RELEASE: 2 clients failing to respond to capability
release
mds.code-store.host18w.fdsqff(mds.1): Client k8s-node36 failing to
respond to capability release client_id: 460983
mds.code-store.host16w.vucirx(mds.3): Client failing to respond to
capability release client_id: 460983
[WRN] MDS_CLIENT_OLDEST_TID: 25 clients failing to advance oldest
client/flush tid
mds.code-store.host18w.fdsqff(mds.1): Client k8s-node36 failing to
advance its oldest client/flush tid. client_id: 460983
mds.code-store.host18w.fdsqff(mds.1): Client failing to advance its
oldest client/flush tid. client_id: 460226
mds.code-store.host18w.fdsqff(mds.1): Client k8s-node32 failing to
advance its oldest client/flush tid. client_id: 239797
mds.code-store.host15w.reolpx(mds.5): Client k8s-node34 failing to
advance its oldest client/flush tid. client_id: 460226
mds.code-store.host15w.reolpx(mds.5): Client k8s-node32 failing to
advance its oldest client/flush tid. client_id: 239797
mds.code-store.host15w.reolpx(mds.5): Client failing to advance its
oldest client/flush tid. client_id: 460983
mds.code-store.host18w.rtyvdy(mds.7): Client k8s-node34 failing to
advance its oldest client/flush tid. client_id: 460226
mds.code-store.host18w.rtyvdy(mds.7): Client failing to advance its
oldest client/flush tid. client_id: 239797
mds.code-store.host18w.rtyvdy(mds.7): Client k8s-node36 failing to
advance its oldest client/flush tid. client_id: 460983
mds.code-store.host17w.kcdopb(mds.2): Client failing to advance its
oldest client/flush tid. client_id: 239797
mds.code-store.host17w.kcdopb(mds.2): Client failing to advance its
oldest client/flush tid. client_id: 460983
mds.code-store.host17w.kcdopb(mds.2): Client k8s-node34 failing to
advance its oldest client/flush tid. client_id: 460226
mds.code-store.host17w.kcdopb(mds.2): Client k8s-node24 failing to
advance its oldest client/flush tid. client_id: 12072730
mds.code-store.host20w.bfoftp(mds.4): Client k8s-node32 failing to
advance its oldest client/flush tid. client_id: 239797
mds.code-store.host20w.bfoftp(mds.4): Client k8s-node36 failing to
advance its oldest client/flush tid. client_id: 460983
mds.code-store.host19w.ywrmiz(mds.6): Client k8s-node24 failing to
advance its oldest client/flush tid. client_id: 12072730
mds.code-store.host19w.ywrmiz(mds.6): Client k8s-node34 failing to
advance its oldest client/flush tid. client_id: 460226
mds.code-store.host19w.ywrmiz(mds.6): Client failing to advance its
oldest client/flush tid. client_id: 239797
mds.code-store.host19w.ywrmiz(mds.6): Client failing to advance its
oldest client/flush tid. client_id: 460983
mds.code-store.host16w.vucirx(mds.3): Client failing to advance its
oldest client/flush tid. client_id: 460983
mds.code-store.host16w.vucirx(mds.3): Client failing to advance its
oldest client/flush tid. client_id: 460226
mds.code-store.host16w.vucirx(mds.3): Client failing to advance its
oldest client/flush tid. client_id: 239797
mds.code-store.host17w.pdziet(mds.0): Client k8s-node32 failing to
advance its oldest client/flush tid. client_id: 239797
mds.code-store.host17w.pdziet(mds.0): Client k8s-node34 failing to
advance its oldest client/flush tid. client_id: 460226
mds.code-store.host17w.pdziet(mds.0): Client k8s-node36 failing to
advance its oldest client/flush tid. client_id: 460983
[WRN] MDS_SLOW_REQUEST: 3 MDSs report slow requests
mds.code-store.host15w.reolpx(mds.5): 4 slow requests are blocked > 5
secs
mds.code-store.host20w.bfoftp(mds.4): 6 slow requests are blocked > 5
secs
mds.code-store.host16w.vucirx(mds.3): 97 slow requests are blocked > 5
secs
[WRN] MDS_TRIM: 3 MDSs behind on trimming
mds.code-store.host15w.reolpx(mds.5): Behind on trimming (25831/128)
max_segments: 128, num_segments: 25831
mds.code-store.host20w.bfoftp(mds.4): Behind on trimming (27605/128)
max_segments: 128, num_segments: 27605
mds.code-store.host16w.vucirx(mds.3): Behind on trimming (28676/128)
max_segments: 128, num_segments: 28676
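For what it's worth, the commands I am considering for clearing the stuck sessions look roughly like this (just a sketch; the MDS name and client id are taken from the output above, and evicting would force the affected k8s node to remount):

# inspect the session behind one of the warned client ids
ceph tell mds.code-store.host18w.fdsqff session ls | grep -A 20 '"id": 460983'

# evict the stuck session if it really is hung
ceph tell mds.code-store.host18w.fdsqff client evict id=460983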
Hi Matthew,
At least for Nautilus (14.2.22), I have discovered through trial and
error that you need to specify a beginning or end date. Something like
this:
radosgw-admin sync error trim --end-date="2023-08-20 23:00:00" --rgw-zone={your_zone_name}
I specify the zone as there's an error list for each zone.
Hopefully that helps.
Rich
------------------------------
Date: Sat, 19 Aug 2023 12:48:55 -0400
From: Matthew Darwin <bugs(a)mdarwin.ca>
Subject: [ceph-users] radosgw-admin sync error trim seems to do
nothing
To: Ceph Users <ceph-users(a)ceph.io>
Message-ID: <95e7edfd-ca29-fc0e-a30a-987f1c43e2d4(a)mdarwin.ca>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hello all,
"radosgw-admin sync error list" returns errors from 2022. I want to
clear those out.
I tried "radosgw-admin sync error trim" but it seems to do nothing.
The man page seems to offer no suggestions
https://protect-au.mimecast.com/s/26o0CzvkGRhLoOXfXjZR3?domain=docs.ceph.com
Any ideas what I need to do to remove old errors? (or at least I want
to see more recent errors)
ceph version 17.2.6 (quincy)
Thanks.
Hi
I'm trying to mark one OSD as down so we can clean it out and replace
it. It keeps getting medium read errors, so it's bound to fail sooner
rather than later. When I tell Ceph from the mon to mark the OSD down,
it doesn't actually stay down. When the service on the OSD stops, it
also gets marked out, and I'm thinking (but perhaps incorrectly?) that
it would be good to keep the OSD down+in, to try to read from it for as
long as possible. Why doesn't it get marked down and stay that way when
I command it?
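For reference, what I am actually running is along these lines (the OSD id is an example):

# mark the failing OSD down while leaving it "in"
ceph osd down osd.42

# a few seconds later it shows as up again, presumably because the daemon
# is still running and simply re-asserts itself to the monitors
ceph osd tree | grep -w 'osd.42'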
Context: our cluster is in a less than optimal state (see below); this
is after one of the OSD nodes failed and took a week to get back up
(long story). Due to seriously unbalanced filling of our OSDs, we kept
having to reweight OSDs to stay below the 85% threshold. Several disks
are starting to fail now (they're 4+ years old, so failures are expected
to occur more frequently).
I'm open to suggestions to help get us back to health_ok more quickly,
but I think we'll get there eventually anyway...
Cheers
/Simon
----
# ceph -s
cluster:
health: HEALTH_ERR
1 clients failing to respond to cache pressure
1/843763422 objects unfound (0.000%)
noout flag(s) set
14 scrub errors
Possible data damage: 1 pg recovery_unfound, 1 pg inconsistent
Degraded data redundancy: 13795525/7095598195 objects
degraded (0.194%), 13 pgs degraded, 12 pgs undersized
70 pgs not deep-scrubbed in time
65 pgs not scrubbed in time
services:
mon: 3 daemons, quorum cephmon3,cephmon1,cephmon2 (age 11h)
mgr: cephmon3(active, since 35h), standbys: cephmon1
mds: 1/1 daemons up, 1 standby
osd: 264 osds: 264 up (since 2m), 264 in (since 75m); 227 remapped pgs
flags noout
rgw: 8 daemons active (4 hosts, 1 zones)
data:
volumes: 1/1 healthy
pools: 15 pools, 3681 pgs
objects: 843.76M objects, 1.2 PiB
usage: 2.0 PiB used, 847 TiB / 2.8 PiB avail
pgs: 13795525/7095598195 objects degraded (0.194%)
54839263/7095598195 objects misplaced (0.773%)
1/843763422 objects unfound (0.000%)
3374 active+clean
195 active+remapped+backfill_wait
65 active+clean+scrubbing+deep
20 active+remapped+backfilling
11 active+clean+snaptrim
10 active+undersized+degraded+remapped+backfill_wait
2 active+undersized+degraded+remapped+backfilling
2 active+clean+scrubbing
1 active+recovery_unfound+degraded
1 active+clean+inconsistent
progress:
Global Recovery Event (8h)
[==========================..] (remaining: 2h)
Hi all,
I have a Ceph cluster on Quincy (17.2.6) with 3 pools (1 RBD pool and 1
CephFS volume), each configured with 3 replicas.
$ sudo ceph osd pool ls detail
pool 7 'cephfs_data_home' replicated size 3 min_size 2 crush_rule 1
object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode on
last_change 6287147 lfor 0/5364613/5364611 flags hashpspool stripe_width
0 application cephfs
pool 8 'cephfs_metadata_home' replicated size 3 min_size 2 crush_rule 3
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change
6333341 lfor 0/6333341/6333339 flags hashpspool stripe_width 0
application cephfs
pool 9 'rbd_backup_vms' replicated size 3 min_size 2 crush_rule 2
object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode on
last_change 6365131 lfor 0/211948/249421 flags
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 10 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1 pgp_num 1 autoscale_mode warn last_change 6365131
flags hashpspool stripe_width 0 pg_num_min 1 application
mgr,mgr_devicehealth
$ sudo ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 306 TiB 186 TiB 119 TiB 119 TiB 39.00
nvme 4.4 TiB 4.3 TiB 118 GiB 118 GiB 2.63
TOTAL 310 TiB 191 TiB 119 TiB 119 TiB 38.49
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
cephfs_data_home 7 512 12 TiB 28.86M 12 TiB 12.85 27 TiB
cephfs_metadata_home 8 32 33 GiB 3.63M 33 GiB 0.79 1.3 TiB
rbd_backup_vms 9 1024 24 TiB 6.42M 24 TiB 58.65 5.6 TiB
.mgr 10 1 35 MiB 9 35 MiB 0 12 TiB
I am going to extend the RBD pool (rbd_backup_vms), currently at about 60% usage.
This pool spans 60 disks, i.e. 20 disks per rack in the crush map. The
pool is used for storing VM disk images (made available to a separate
Proxmox VE cluster).
For this purpose, I am going to add 42 disks of the same size as those
currently in the pool, i.e. 14 additional disks in each rack.
Currently, this pool is configured with 1024 PGs.
Before this operation, I would like to increase the number of PGs, let's
say to 2048 (i.e. double it).
I wonder about the overall impact of this change on the cluster. I expect
that the heavy PG movement will have a strong impact on IOPS?
I have two questions:
1) Is it useful to make this modification before adding the new OSDs?
(I'm afraid of warnings about full or nearfull PGs if not.)
2) Are there any configuration recommendations to minimize these
anticipated impacts? (A rough sketch of what I have in mind follows below.)
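For question 2, this is the kind of thing I am considering (just a sketch; the values are examples I have not validated, not recommendations):

# double the PG count on the pool before adding the new OSDs
ceph osd pool set rbd_backup_vms pg_num 2048

# throttle recovery/backfill while data is moving
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1

# cap how much misplaced data the gradual pgp_num increase may create at once
ceph config set mgr target_max_misplaced_ratio 0.05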
Thank you!
Cheers,
Hervé
Hi,
I'm running a Ceph 17.2.5 Rados Gateway and I have a user with more than
1000 buckets.
When the client tries to list all of their buckets using s3cmd, rclone and
Python boto3, all three only ever return the first 1000 bucket names.
I can confirm the buckets are all there (and more than 1000) by checking
with the radosgw-admin command.
Have I missed a pagination limit for listing user buckets in the rados
gateway?
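In case it is a server-side cap rather than a client issue, my (unverified) guess is that the relevant RGW option is rgw_list_buckets_max_chunk, which I understand defaults to 1000. Something like this is what I would try:

# check the value currently applied to the RGW daemons
ceph config get client.rgw rgw_list_buckets_max_chunk

# raise it above the user's bucket count, then restart the RGW daemons
ceph config set client.rgw rgw_list_buckets_max_chunk 5000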
Thanks,
Tom
Hello,
I have a haproxy problem on Ceph Quincy 17.2.6, Ubuntu 22.04.
The haproxy container won't come up after I specify a custom haproxy.cfg, and there is no error in the logs.
I set the haproxy.cfg like this: ceph config-key set mgr/cephadm/services/ingress/haproxy.cfg -i haproxy.cfg
If I remove my haproxy.cfg and let cephadm generate it automatically, it works. I tried to create a file identical to the one cephadm generated, but haproxy was still down.
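For reference, this is how I have been poking at it so far (a sketch; the daemon names are the ones shown in the ceph orch ps output below):

# confirm what is actually stored under the config-key
ceph config-key get mgr/cephadm/services/ingress/haproxy.cfg

# redeploy one daemon and read its container logs on the host (test01)
ceph orch daemon redeploy haproxy.rgw.foo.test01.pfiiix
cephadm logs --name haproxy.rgw.foo.test01.pfiiix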
NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
haproxy.rgw.foo.test01.pfiiix test01 *:80,9101 error 2m ago 6h - - <unknown> <unknown> <unknown>
haproxy.rgw.foo.test02.fsnhnb test02 *:80,9101 error 2m ago 6h - - <unknown> <unknown> <unknown>
Any advice on what to do?
Thanks,
Jie
The second issue of "Ceph Quarterly" is attached to this email. Ceph Quarterly (or "CQ") is an overview of the past three months of upstream Ceph development. We provide CQ in three formats: A4, letter, and plain text wrapped at 80 columns.
Two news items arrived after the deadline for typesetting this issue. They are included here:
Grace Hopper Open Source Day 2023:
- On 22 Sep 2023, Ceph participated in Grace Hopper Open Source Day, an all-day hackathon for women and nonbinary developers. Laura Flores led the Ceph division, and Yaarit Hatuka, Shreyansh Sancheti, and Aishwarya Mathuria participated as mentors. From 12pm EST to 7:30pm EST, Laura showed more than 40 attendees how to run a Ceph vstart cluster in an Ubuntu Docker container. Yaarit, Shreyansh, and Aishwarya spent the day working one-on-one with attendees, helping them troubleshoot and work through a curated list of low-hanging-fruit issues. By the day's end, Grace Hopper attendees had submitted eight pull requests. As of the publication of this sentence, two have been merged and the others are expected to be merged soon.
- For more information about GHC Open Source Day, see https://ghc.anitab.org/awards-programs/open-source-day/
Ceph partners with RCOS:
- Ceph has partnered for the first time with the Rensselaer Center for Open Source (RCOS), an organization at Rensselaer Polytechnic Institute that helps students jumpstart their careers in software by giving them the opportunity to work on various open source projects for class credit.
- Laura Flores, representing Ceph, is mentoring three RPI students on a project to improve the output of the `ceph balancer status` command.
- For more information about RCOS, see https://rcos.io/
Zac Dover
Upstream Documentation
Ceph Foundation