Hey everyone,
On 20/10/2022 10:12, Christian Rohmann wrote:
> 1) May I bring up again my remarks about the timing:
>
> On 19/10/2022 11:46, Christian Rohmann wrote:
>
>> I believe the upload of a new release to the repo prior to the
>> announcement happens quite regularly - it might just be due to the
>> technical process of releasing.
>> But I agree it would be nice to have a more "bit flip" approach to
>> new releases in the repo and not have the packages appear as updates
>> prior to the announcement and final release and update notes.
> By my observations, packages are sometimes available on the
> download servers via the "last stable" folders, such as
> https://download.ceph.com/debian-quincy/, quite some time before the
> announcement of a release is out.
> I know it's hard to time this right, with mirrors requiring some time
> to sync files, but it would be nice not to see the packages - or have
> people install them - before the release notes and potential
> pointers to changes are out.
Today's 16.2.11 release shows the exact issue I described above:
1) 16.2.11 packages are already available via e.g.
https://download.ceph.com/debian-pacific
2) The release notes are not yet merged
(https://github.com/ceph/ceph/pull/49839), so
https://ceph.io/en/news/blog/2022/v16-2-11-pacific-released/ shows a 404 :-)
3) No announcement like
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/QOCU563UD3…
has been sent to the ML yet.
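Until the process changes, a workaround on the consumer side is to hold the packages until the announcement is out, e.g. (a sketch for Debian/Ubuntu clients of the repo):
```
# keep the currently installed Ceph packages until the release notes are published
apt-mark hold ceph ceph-common ceph-osd ceph-mon ceph-mds radosgw
# ... and lift the hold once the announcement is out
apt-mark unhold ceph ceph-common ceph-osd ceph-mon ceph-mds radosgw
```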
Regards
Christian
Hey ceph-users,
I set up multisite sync between two freshly installed Octopus clusters.
In the first cluster I created a bucket with some data just to test the
replication of actual data later.
I then followed the instructions on
https://docs.ceph.com/en/octopus/radosgw/multisite/#migrating-a-single-site…
to add a second zone.
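Condensed, the steps on the second cluster looked roughly like this (a sketch; the placeholders in angle brackets stand for the endpoints and the system user's keys, which I left out here):
```
# pull the realm/period from the master zone into the second cluster
radosgw-admin realm pull --url=http://<master-rgw-endpoint> \
    --access-key=<system-access-key> --secret=<system-secret-key>
# create the secondary zone in the existing zonegroup
radosgw-admin zone create --rgw-zonegroup=obst-fra --rgw-zone=obst-az1 \
    --endpoints=http://<second-rgw-endpoint> \
    --access-key=<system-access-key> --secret=<system-secret-key>
# commit the new period, then restart the radosgw on the second cluster
radosgw-admin period update --commit
```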
Things went well and both zones are now happily reaching each other and
the API endpoints are talking.
The metadata is also already in sync - both sides are happy, and I can
see that bucket listings and users are "in sync":
> # radosgw-admin sync status
> realm 13d1b8cb-dc76-4aed-8578-2ce5d3d010e8 (obst)
> zonegroup 17a06c15-2665-484e-8c61-cbbb806e11d2 (obst-fra)
> zone 6d2c1275-527e-432f-a57a-9614930deb61 (obst-rgn)
> metadata sync no sync (zone is master)
> data sync source: c07447eb-f93a-4d8f-bf7a-e52fade399f3 (obst-az1)
> init
> full sync: 128/128 shards
> full sync: 0 buckets to sync
> incremental sync: 0/128 shards
> data is behind on 128 shards
> behind shards: [0...127]
>
and on the other side ...
> # radosgw-admin sync status
> realm 13d1b8cb-dc76-4aed-8578-2ce5d3d010e8 (obst)
> zonegroup 17a06c15-2665-484e-8c61-cbbb806e11d2 (obst-fra)
> zone c07447eb-f93a-4d8f-bf7a-e52fade399f3 (obst-az1)
> metadata sync syncing
> full sync: 0/64 shards
> incremental sync: 64/64 shards
> metadata is caught up with master
> data sync source: 6d2c1275-527e-432f-a57a-9614930deb61 (obst-rgn)
> init
> full sync: 128/128 shards
> full sync: 0 buckets to sync
> incremental sync: 0/128 shards
> data is behind on 128 shards
> behind shards: [0...127]
>
The newly created buckets (read: their metadata) are also synced.
What is apparently not working is the sync of the actual data.
Upon startup, the radosgw on the second site shows:
> 2021-06-25T16:15:06.445+0000 7fe71eff5700 1 RGW-SYNC:meta: start
> 2021-06-25T16:15:06.445+0000 7fe71eff5700 1 RGW-SYNC:meta: realm
> epoch=2 period id=f4553d7c-5cc5-4759-9253-9a22b051e736
> 2021-06-25T16:15:11.525+0000 7fe71dff3700 0
> RGW-SYNC:data:sync:init_data_sync_status: ERROR: failed to read remote
> data log shards
>
Also, when issuing
# radosgw-admin data sync init --source-zone obst-rgn
it throws:
> 2021-06-25T16:20:29.167+0000 7f87c2aec080 0
> RGW-SYNC:data:init_data_sync_status: ERROR: failed to read remote data
> log shards
Does anybody have any hints on where to look for what could be broken here?
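For reference, these are the commands I have been using to compare the period and zone configuration on both ends (zone names as above):
```
# both sides should report the same current period and epoch
radosgw-admin period get
# check the zone definitions, in particular the endpoints and the system user's keys
radosgw-admin zone get --rgw-zone=obst-rgn
radosgw-admin zone get --rgw-zone=obst-az1
# see whether the sync machinery has recorded any errors yet
radosgw-admin sync error list
```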
Thanks a bunch,
Regards
Christian
Bonjour,
Reading Karan's blog post from last year about benchmarking the insertion of billions of objects into Ceph via S3 / RGW[0], it reads:
> we decided to lower bluestore_min_alloc_size_hdd to 18KB and re-test. As represented in chart-5, the object creation rate found to be notably reduced after lowering the bluestore_min_alloc_size_hdd parameter from 64KB (default) to 18KB. As such, for objects larger than the bluestore_min_alloc_size_hdd , the default values seems to be optimal, smaller objects further require more investigation if you intended to reduce bluestore_min_alloc_size_hdd parameter.
There is also a mail thread from 2018 on this topic, with the same conclusion, although it uses RADOS directly and not RGW[3]. I read the RGW data layout page in the documentation[1] and concluded that by default every object inserted via S3 / RGW will indeed use at least 64 KiB. A pull request from last year[2] seems to confirm this and also suggests that modifying bluestore_min_alloc_size_hdd has adverse side effects.
That being said, I'm curious to know whether people have developed strategies to cope with this overhead. Someone mentioned packing objects together client-side to make them larger, but maybe there are simpler ways to achieve the same?
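To make the overhead concrete, a back-of-the-envelope calculation (a sketch assuming a 4 KiB average object size, the 64 KiB bluestore_min_alloc_size_hdd default, and a replicated data pool with size 3):
```
object_kib=4        # assumed average S3 object payload
min_alloc_kib=64    # bluestore_min_alloc_size_hdd default on HDD OSDs
replicas=3          # assumed replication factor of the data pool
echo "payload stored: $(( object_kib * replicas )) KiB raw"
echo "space consumed: $(( min_alloc_kib * replicas )) KiB raw (at least)"
echo "amplification:  $(( min_alloc_kib / object_kib ))x vs. the payload"
```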
Cheers
[0] https://www.redhat.com/en/blog/scaling-ceph-billion-objects-and-beyond
[1] https://docs.ceph.com/en/latest/radosgw/layout/
[2] https://github.com/ceph/ceph/pull/32809
[3] https://www.spinics.net/lists/ceph-users/msg45755.html
--
Loïc Dachary, Artisan Logiciel Libre
On Thu, Dec 15, 2022 at 9:32 AM Stolte, Felix <f.stolte(a)fz-juelich.de> wrote:
>
> Hi Patrick,
>
> we used your script to repair the damaged objects on the weekend and it went smoothly. Thanks for your support.
>
> We adjusted your script to scan for damaged files on a daily basis; the runtime is about 6h. Until Thursday last week, we had exactly the same 17 files. On Thursday at 13:05 a snapshot was created, and our active MDS crashed once at exactly that moment:
>
> 2022-12-08T13:05:48.919+0100 7f440afec700 -1 /build/ceph-16.2.10/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7f440afec700 time 2022-12-08T13:05:48.921223+0100
> /build/ceph-16.2.10/src/mds/ScatterLock.h: 59: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)
>
> 12 minutes later the unlink_local error crashes appeared again, this time with a new file. During debugging we noticed an MTU mismatch between the MDS (1500) and a client (9000) using the cephfs kernel mount. That client is also the one creating the snapshots via mkdir in the .snap directory.
>
> We disabled snapshot creation for now, but we really need this feature. I uploaded the MDS logs of the first crash, along with the information above, to https://tracker.ceph.com/issues/38452
>
> I would greatly appreciate it if you could answer the following questions:
>
> Is the bug related to our MTU mismatch? We also fixed the MTU issue over the weekend by going back to 1500 on all nodes in the ceph public network.
I doubt it.
> If you need a debug level 20 log of the ScatterLock for further analysis, I could schedule snapshots at the end of our workdays and increase the debug level for 5 minutes around snapshot creation.
This would be very helpful!
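Something along these lines should do it (a sketch; the `mds` section applies to all MDS daemons):
```
# shortly before the scheduled snapshot (the mkdir in .snap):
ceph config set mds debug_mds 20
ceph config set mds debug_ms 1
# ... wait the ~5 minutes around snapshot creation, then drop the overrides again:
ceph config rm mds debug_mds
ceph config rm mds debug_ms
```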
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hi all,
our monitors have enjoyed democracy since the beginning. However, I don't share their sudden excitement about voting:
2/9/23 4:42:30 PM[INF]overall HEALTH_OK
2/9/23 4:42:30 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:42:26 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:42:26 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:40:00 PM[INF]overall HEALTH_OK
2/9/23 4:30:00 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]overall HEALTH_OK
2/9/23 4:24:34 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-03 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-26 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-25 calling monitor election
2/9/23 4:24:29 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:24:04 PM[INF]overall HEALTH_OK
2/9/23 4:24:03 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 4:23:59 PM[INF]mon.ceph-01 calling monitor election
2/9/23 4:23:59 PM[INF]mon.ceph-02 calling monitor election
2/9/23 4:20:00 PM[INF]overall HEALTH_OK
2/9/23 4:10:00 PM[INF]overall HEALTH_OK
2/9/23 4:00:00 PM[INF]overall HEALTH_OK
2/9/23 3:50:00 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]overall HEALTH_OK
2/9/23 3:43:13 PM[INF]mon.ceph-01 is new leader, mons ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 in quorum (ranks 0,1,2,3,4)
2/9/23 3:43:08 PM[INF]mon.ceph-01 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-26 calling monitor election
2/9/23 3:43:08 PM[INF]mon.ceph-25 calling monitor election
We moved a switch from one rack to another, and ever since the switch came back up, the monitors frequently bitch about who is the alpha. How do I get them to focus on their daily duties again?
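For reference, this is roughly where I have been looking so far (mon names as above; the last command only makes sense with the connectivity election strategy):
```
ceph quorum_status --format json-pretty
# run the following on the host of the respective mon
ceph daemon mon.ceph-01 mon_status
ceph daemon mon.ceph-01 connection scores dump
```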
Thanks for any help!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
I am running Ceph 15.2.13 on CentOS 7.9.2009 and recently my MDS servers
have started failing with the error message
In function 'void Server::handle_client_open(MDRequestRef&)' thread
7f0ca9908700 time 2021-06-28T09:21:11.484768+0200
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/15.2.13/rpm/el7/BUILD/ceph-15.2.13/src/mds/Server.cc:
4149: FAILED ceph_assert(cur->is_auth())
Complete log is:
https://gist.github.com/pvanheus/4da555a6de6b5fa5e46cbf74f5500fbd
ceph status output is:
# ceph status
cluster:
id: ed7b2c16-b053-45e2-a1fe-bf3474f90508
health: HEALTH_WARN
30 OSD(s) experiencing BlueFS spillover
insufficient standby MDS daemons available
1 MDSs report slow requests
2 mgr modules have failed dependencies
4347046/326505282 objects misplaced (1.331%)
6 nearfull osd(s)
23 pgs not deep-scrubbed in time
23 pgs not scrubbed in time
8 pool(s) nearfull
services:
mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 22m)
mgr: ceph-mon1(active, since 11w), standbys: ceph-mon2, ceph-mon3
mds: SANBI_FS:2 {0=ceph-mon1=up:active(laggy or
crashed),1=ceph-mon2=up:stopping}
osd: 54 osds: 54 up (since 2w), 54 in (since 11w); 50 remapped pgs
data:
pools: 8 pools, 833 pgs
objects: 42.37M objects, 89 TiB
usage: 159 TiB used, 105 TiB / 264 TiB avail
pgs: 4347046/326505282 objects misplaced (1.331%)
782 active+clean
49 active+clean+remapped
1 active+clean+scrubbing+deep
1 active+clean+remapped+scrubbing
io:
client: 29 KiB/s rd, 427 KiB/s wr, 37 op/s rd, 48 op/s wr
When restarting an MDS it goes through the states replay, reconnect and
resolve, and finally sets itself to active before this crash happens.
Any advice on what to do?
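For reference, this is how I have been following the state transitions while an MDS restarts (daemon names as in the status output above):
```
# repeat while the MDS comes back up to watch replay -> reconnect -> resolve -> active
ceph fs status
# more detail straight from the daemon (run on the MDS host)
ceph daemon mds.ceph-mon1 status
```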
Thanks,
Peter
P.S. apologies if you received this email more than once - I have had some
trouble figuring out the correct mailing list to use.
Hi Team,
We have a ceph cluster with 3 storage nodes:
1. storagenode1 - abcd:abcd:abcd::21
2. storagenode2 - abcd:abcd:abcd::22
3. storagenode3 - abcd:abcd:abcd::23
The requirement is to mount ceph using the domain name of the MON node.
Note: the domain name resolves via our DNS server.
For this we are using the command:
```
mount -t ceph [storagenode.storage.com]:6789:/ /backup -o
name=admin,secret=AQCM+8hjqzuZEhAAcuQc+onNKReq7MV+ykFirg==
```
We are getting the following logs in /var/log/messages:
```
Jan 24 17:23:17 localhost kernel: libceph: resolve 'storagenode.storage.com'
(ret=-3): failed
Jan 24 17:23:17 localhost kernel: libceph: parse_ips bad ip '
storagenode.storage.com:6789'
```
We also tried mounting the ceph storage using the IP of the MON, which works fine.
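For example, resolving the name in userspace first and handing the kernel client a literal address works (a sketch of that workaround, with the secret as above):
```
# resolve the MON's AAAA record in userspace, then mount via the literal IPv6 address
MON_IP=$(getent ahostsv6 storagenode.storage.com | awk 'NR==1 {print $1}')
mount -t ceph "[${MON_IP}]:6789:/" /backup -o name=admin,secret=<key as above>
```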
Query:
Could you please help us out with how we can mount ceph using the FQDN?
My /etc/ceph/ceph.conf is as follows:
[global]
ms bind ipv6 = true
ms bind ipv4 = false
mon initial members = storagenode1,storagenode2,storagenode3
osd pool default crush rule = -1
fsid = 7969b8a3-1df7-4eae-8ccf-2e5794de87fe
mon host =
[v2:[abcd:abcd:abcd::21]:3300,v1:[abcd:abcd:abcd::21]:6789],[v2:[abcd:abcd:abcd::22]:3300,v1:[abcd:abcd:abcd::22]:6789],[v2:[abcd:abcd:abcd::23]:3300,v1:[abcd:abcd:abcd::23]:6789]
public network = abcd:abcd:abcd::/64
cluster network = eff0:eff0:eff0::/64
[osd]
osd memory target = 4294967296
[client.rgw.storagenode1.rgw0]
host = storagenode1
keyring = /var/lib/ceph/radosgw/ceph-rgw.storagenode1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-storagenode1.rgw0.log
rgw frontends = beast endpoint=[abcd:abcd:abcd::21]:8080
rgw thread pool size = 512
--
~ Lokendra
skype: lokendrarathour
Hello,
What's the status with the *-stable-* tags?
https://quay.io/repository/ceph/daemon?tab=tags
Are they no longer built/supported?
What should we use until we migrate from ceph-ansible to cephadm?
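For reference, this is how I have been checking which tags still get pushed (assuming skopeo is installed):
```
skopeo list-tags docker://quay.io/ceph/daemon | grep stable
```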
Thanks.
--
Jonas
Hi,
today I did the first update from Octopus to Pacific, and it looks like the
average apply latency went up from 1 ms to 2 ms.
All 36 OSDs are 4TB SSDs and nothing else changed.
Does someone know if this is a known issue, or am I just missing a config value?
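For comparison, one way to look at the per-OSD values (a sketch; the third column of `ceph osd perf` is the apply latency in ms):
```
ceph osd perf | sort -nk3 | tail
```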
Cheers
Boris