Hi Team,
We have a ceph cluster with 3 storage nodes:
1. storagenode1 - abcd:abcd:abcd::21
2. storagenode2 - abcd:abcd:abcd::22
3. storagenode3 - abcd:abcd:abcd::23
The requirement is to mount Ceph using the domain name of the MON node.
Note: the domain name resolves via our DNS server.
For this we are using the command:
```
mount -t ceph [storagenode.storage.com]:6789:/ /backup \
    -o name=admin,secret=AQCM+8hjqzuZEhAAcuQc+onNKReq7MV+ykFirg==
```
We are getting the following logs in /var/log/messages:
```
Jan 24 17:23:17 localhost kernel: libceph: resolve 'storagenode.storage.com' (ret=-3): failed
Jan 24 17:23:17 localhost kernel: libceph: parse_ips bad ip 'storagenode.storage.com:6789'
```
We also tried mounting the Ceph storage using the IP of the MON, which works fine.
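For reference, the working IP-based variant looked roughly like this (a sketch; it substitutes the first MON address from the ceph.conf below, with the same secret):
```
mount -t ceph [abcd:abcd:abcd::21]:6789:/ /backup \
    -o name=admin,secret=AQCM+8hjqzuZEhAAcuQc+onNKReq7MV+ykFirg==
```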
Query:
Could you please help us with how we can mount Ceph using the FQDN?
My /etc/ceph/ceph.conf is as follows:
[global]
ms bind ipv6 = true
ms bind ipv4 = false
mon initial members = storagenode1,storagenode2,storagenode3
osd pool default crush rule = -1
fsid = 7969b8a3-1df7-4eae-8ccf-2e5794de87fe
mon host = [v2:[abcd:abcd:abcd::21]:3300,v1:[abcd:abcd:abcd::21]:6789],[v2:[abcd:abcd:abcd::22]:3300,v1:[abcd:abcd:abcd::22]:6789],[v2:[abcd:abcd:abcd::23]:3300,v1:[abcd:abcd:abcd::23]:6789]
public network = abcd:abcd:abcd::/64
cluster network = eff0:eff0:eff0::/64
[osd]
osd memory target = 4294967296
[client.rgw.storagenode1.rgw0]
host = storagenode1
keyring = /var/lib/ceph/radosgw/ceph-rgw.storagenode1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-storagenode1.rgw0.log
rgw frontends = beast endpoint=[abcd:abcd:abcd::21]:8080
rgw thread pool size = 512
--
~ Lokendra
skype: lokendrarathour
Hello,
What's the status with the *-stable-* tags?
https://quay.io/repository/ceph/daemon?tab=tags
Are these no longer built/supported?
What should we use until we migrate from ceph-ansible to cephadm?
Thanks.
--
Jonas
Hi,
today I did the first update from Octopus to Pacific, and it looks like the average apply latency went up from 1 ms to 2 ms.
All 36 OSDs are 4TB SSDs and nothing else changed.
Does someone know if this is a known issue, or am I just missing a config value?
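A quick way to dump the per-OSD commit/apply latency for such a before/after comparison (a generic check, nothing specific to this upgrade):
```
# show commit/apply latency per OSD
ceph osd perf
```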
Cheers
Boris
Hi,
I have set up a Ceph cluster with cephadm using the Docker backend.
I want to move /var/lib/docker to a separate device to get better
performance and less load on the OS device.
I tried that by stopping Docker, copying the content of /var/lib/docker to the new device, and mounting the new device at /var/lib/docker.
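The move was done roughly along these lines (a sketch; the device and mount point names are made-up examples):
```
systemctl stop docker
mount /dev/sdX1 /mnt/new-docker            # the new device (example name)
rsync -aHAX /var/lib/docker/ /mnt/new-docker/
umount /mnt/new-docker
mount /dev/sdX1 /var/lib/docker            # plus a matching /etc/fstab entry
systemctl start docker
```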
The other containers started and continue to run as expected.
But the Ceph containers seem to be broken, and I am not able to get them back into a working state.
I have tried to remove the host with `ceph orch host rm itcnchn-bb4067` and re-add it, but with no effect.
The strange thing is that 2 of 4 containers come up as expected.
ceph orch ps itcnchn-bb4067
NAME                                  HOST            STATUS         REFRESHED  AGE  VERSION    IMAGE NAME               IMAGE ID      CONTAINER ID
crash.itcnchn-bb4067                  itcnchn-bb4067  running (18h)  10m ago    4w   15.2.7     docker.io/ceph/ceph:v15  2bc420ddb175  2af28c4571cf
mds.cephfs.itcnchn-bb4067.qzoshl      itcnchn-bb4067  error          10m ago    4w   <unknown>  docker.io/ceph/ceph:v15  <unknown>     <unknown>
mon.itcnchn-bb4067                    itcnchn-bb4067  error          10m ago    18h  <unknown>  docker.io/ceph/ceph:v15  <unknown>     <unknown>
rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc  itcnchn-bb4067  running (18h)  10m ago    4w   15.2.7     docker.io/ceph/ceph:v15  2bc420ddb175  00d000aec32b
The Docker logs from the active manager do not say much about what is wrong:
debug 2021-01-05T09:57:52.537+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring mds.cephfs.itcnchn-bb4067.qzoshl (unknown last config time)...
debug 2021-01-05T09:57:52.541+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon mds.cephfs.itcnchn-bb4067.qzoshl on itcnchn-bb4067
debug 2021-01-05T09:57:52.973+0000 7fdb64e88700 0 log_channel(cluster) log [DBG] : pgmap v347: 241 pgs: 241 active+clean; 18 GiB data, 50 GiB used, 52 TiB / 52 TiB avail; 18 KiB/s rd, 78 KiB/s wr, 24 op/s
debug 2021-01-05T09:57:53.085+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring mon.itcnchn-bb4067 (unknown last config time)...
debug 2021-01-05T09:57:53.085+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon mon.itcnchn-bb4067 on itcnchn-bb4067
debug 2021-01-05T09:57:53.625+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc (unknown last config time)...
debug 2021-01-05T09:57:53.629+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon rgw.ikea.dc9-1.itcnchn-bb4067.gtqedc on itcnchn-bb4067
debug 2021-01-05T09:57:54.141+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring crash.itcnchn-bb4067 (unknown last config time)...
debug 2021-01-05T09:57:54.141+0000 7fdb69691700 0 log_channel(cephadm) log [INF] : Reconfiguring daemon crash.itcnchn-bb4067 on itcnchn-bb4067
- Karsten
Hi there,
We noticed that, after creating a signurl (presigned URL), the bucket resources were accessible from IPs that were originally restricted from accessing them (via a bucket policy).
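For context, the presigned URL was generated with something along these lines (bucket, object, and expiry are placeholders):
```
s3cmd signurl s3://mybucket/myobject +3600
```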
Using the s3cmd utility we confirmed that the policy is applied correctly and the bucket can only be accessed from the allowed IPs.
Is this expected behavior, or are we missing something?
Thanks!
ceph version: 17.2.0 on Ubuntu 22.04
non-containerized ceph from Ubuntu repos
cluster started on luminous
I have been using bcache on filestore on rotating disks for many years
without problems. Now converting OSDs to bluestore, there are some
strange effects.
If I create the bcache device, set its rotational flag to '1', then do
ceph-volume lvm create ... --crush-device-class=hdd
the OSD comes up with the right parameters and with much improved latency compared to an OSD sitting directly on /dev/sdX.
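Spelled out, that preparation step is roughly the following (a sketch; the bcache device name and data path are examples):
```
# mark the bcache device as rotational before creating the OSD (example device)
echo 1 > /sys/block/bcache0/queue/rotational
ceph-volume lvm create --data /dev/bcache0 --crush-device-class=hdd
```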
ceph osd metadata ...
shows
"bluestore_bdev_type": "hdd",
"rotational": "1"
But after a reboot the bcache rotational flag is set to '0' again, and the OSD
now comes up with "rotational": "0".
Latency immediately starts to increase (and keeps increasing over the
following days, possibly due to accumulating fragmentation).
These wrong settings stay in place even if I stop the OSD, set the
bcache rotational flag to '1' again and restart the OSD. I have found no
way to get back to the original settings other than destroying and
recreating the OSD. I guess I am just not seeing something obvious, like
from where these settings get pulled at OSD startup.
I even created udev rules to set bcache rotational=1 at boot time,
before any Ceph daemon starts, but it did not help. Something running
after these rules resets the bcache rotational flag back to 0.
Haven't found the culprit yet, but not sure if it even matters.
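For reference, the udev rule was along these lines (a sketch; the exact match pattern may differ):
```
# /etc/udev/rules.d/99-bcache-rotational.rules (sketch)
ACTION=="add|change", KERNEL=="bcache*", ATTR{queue/rotational}="1"
```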
Are these OSD settings (bluestore_bdev_type, rotational) persisted
somewhere and can they be edited and pinned?
Alternatively, can I manually set and persist the relevant bluestore
tunables (per OSD / per device class) so as to make the bcache
rotational flag irrelevant after the OSD is first created?
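As an illustration of what I mean, assuming the usual config masks can be applied to these options (the value is just the HDD default quoted further down in this thread):
```
ceph config set osd/class:hdd bluestore_prefer_deferred_size 32768
```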
Regards
Matthias
On Fri, Apr 08, 2022 at 03:05:38PM +0300, Igor Fedotov wrote:
> Hi Frank,
>
> in fact this parameter impacts OSD behavior both at build time and during
> regular operation. It simply substitutes the hdd/ssd auto-detection with a
> manual specification, and hence the relevant config parameters are applied. If
> e.g. min_alloc_size is persisted at OSD creation - it wouldn't be
> updated. But if a specific setting can be changed at run-time - it would be altered.
>
> So the proper usage would definitely be manual ssd/hdd mode selection before
> the first OSD creation and keeping that mode for the whole OSD
> lifecycle. But technically one can change the mode at any arbitrary point in
> time, which would result in run-time settings being out of sync with the creation
> ones, with some unclear side effects.
>
> Please also note that this setting was originally intended mostly for
> development/testing purposes, not for regular usage. Hence it's flexible but
> rather unsafe if used improperly.
>
>
> Thanks,
>
> Igor
>
> On 4/7/2022 2:40 PM, Frank Schilder wrote:
> > Hi Richard and Igor,
> >
> > are these tweaks required at build-time (osd prepare) only or are they required for every restart?
> >
> > Is this setting "bluestore debug enforce settings=hdd" in the ceph config database or set somewhere else? How does this work if deploying HDD- and SSD-OSDs at the same time?
> >
> > Ideally, all these tweaks should be applicable and settable at creation time only without affecting generic settings (that is, at the ceph-volume command line and not via config side effects). Otherwise it becomes really tedious to manage these.
> >
> > For example, would the following work-flow apply the correct settings *permanently* across restarts:
> >
> > 1) Prepare OSD on fresh HDD with ceph-volume lvm batch --prepare ...
> > 2) Assign dm_cache to logical OSD volume created in step 1
> > 3) Start OSD, restart OSDs, boot server ...
> >
> > I would assume that the HDD settings are burned into the OSD in step 1 and will be used in all future (re-)starts without the need to do anything despite the device being detected as non-rotational after step 2. Is this assumption correct?
> >
> > Thanks and best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Richard Bade <hitrich(a)gmail.com>
> > Sent: 06 April 2022 00:43:48
> > To: Igor Fedotov
> > Cc: Ceph Users
> > Subject: [Warning Possible spam] [ceph-users] Re: Ceph Bluestore tweaks for Bcache
> >
> > Just for completeness for anyone that is following this thread. Igor
> > added that setting in Octopus, so unfortunately I am unable to use it
> > as I am still on Nautilus.
> >
> > Thanks,
> > Rich
> >
> > On Wed, 6 Apr 2022 at 10:01, Richard Bade <hitrich(a)gmail.com> wrote:
> > > Thanks Igor for the tip. I'll see if I can use this to reduce the
> > > number of tweaks I need.
> > >
> > > Rich
> > >
> > > On Tue, 5 Apr 2022 at 21:26, Igor Fedotov <igor.fedotov(a)croit.io> wrote:
> > > > Hi Richard,
> > > >
> > > > just FYI: one can use "bluestore debug enforce settings=hdd" config
> > > > parameter to manually enforce HDD-related settings for a BlueStore
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Igor
> > > >
> > > > On 4/5/2022 1:07 AM, Richard Bade wrote:
> > > > > Hi Everyone,
> > > > > I just wanted to share a discovery I made about running bluestore on
> > > > > top of Bcache in case anyone else is doing this or considering it.
> > > > > We've run Bcache under Filestore for a long time with good results but
> > > > > recently rebuilt all the osds on bluestore. This caused some
> > > > > degradation in performance that I couldn't quite put my finger on.
> > > > > Bluestore osds have some smarts where they detect the disk type.
> > > > > Unfortunately in the case of Bcache it detects as SSD, when in fact
> > > > > the HDD parameters are better suited.
> > > > > I changed the following parameters to match the HDD default values and
> > > > > immediately saw my average osd latency during normal workload drop
> > > > > from 6ms to 2ms. Peak performance didn't change really, but a test
> > > > > machine that I have running a constant iops workload was much more
> > > > > stable as was the average latency.
> > > > > Performance has returned to Filestore or better levels.
> > > > > Here are the parameters.
> > > > >
> > > > > ; Make sure that we use values appropriate for HDD not SSD - Bcache
> > > > > gets detected as SSD
> > > > > bluestore_prefer_deferred_size = 32768
> > > > > bluestore_compression_max_blob_size = 524288
> > > > > bluestore_deferred_batch_ops = 64
> > > > > bluestore_max_blob_size = 524288
> > > > > bluestore_min_alloc_size = 65536
> > > > > bluestore_throttle_cost_per_io = 670000
> > > > >
> > > > > ; Try to improve responsiveness when some disks are fully utilised
> > > > > osd_op_queue = wpq
> > > > > osd_op_queue_cut_off = high
> > > > >
> > > > > Hopefully someone else finds this useful.
> > > > --
> > > > Igor Fedotov
> > > > Ceph Lead Developer
> > > >
> > > > Looking for help with your Ceph cluster? Contact us at https://croit.io
> > > >
> > > > croit GmbH, Freseniusstr. 31h, 81247 Munich
> > > > CEO: Martin Verges - VAT-ID: DE310638492
> > > > Com. register: Amtsgericht Munich HRB 231263
> > > > Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
> > > >
>
> --
> Igor Fedotov
> Ceph Lead Developer
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>
Has anybody run into a 'stuck' OSD service specification? I've tried
to delete it, but it's stuck in 'deleting' state, and has been for
quite some time (even prior to upgrade, on 15.2.x). This is on 16.2.3:
NAME PORTS RUNNING REFRESHED AGE PLACEMENT
osd.osd_spec 504/525 <deleting> 12m label:osd
root@ceph01:/# ceph orch rm osd.osd_spec
Removed service osd.osd_spec
From the active monitor:
debug 2021-05-06T23:14:48.909+0000 7f17d310b700 0
log_channel(cephadm) log [INF] : Remove service osd.osd_spec
Yet in `ceph orch ls` it's still there, same as above. Running --export on it:
root@ceph01:/# ceph orch ls osd.osd_spec --export
service_type: osd
service_id: osd_spec
service_name: osd.osd_spec
placement: {}
unmanaged: true
spec:
  filter_logic: AND
  objectstore: bluestore
We've tried --force, as well, with no luck.
To be clear, the --export even prior to delete looks nothing like the
actual service specification we're using, even after I re-apply it, so
something seems 'bugged'. Here's the OSD specification we're applying:
service_type: osd
service_id: osd_spec
placement:
  label: "osd"
data_devices:
  rotational: 1
db_devices:
  rotational: 0
db_slots: 12
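For completeness, that spec file is applied with something along the lines of the following (the filename is just an example):
```
ceph orch apply -i osd_spec.yaml
```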
I would appreciate any insight into how to clear this up (without removing the actual OSDs; we just want to apply the updated service specification - we used to use host placement rules and are switching to label-based placement).
Thanks,
David
Hi,
There is an operation, "radosgw-admin bi purge", that removes all bucket index objects for one bucket in the RADOS gateway.
What is the undo operation for this?
After this operation the bucket cannot be listed or removed any more.
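For context, the command in question is invoked roughly like this (the bucket name is a placeholder):
```
radosgw-admin bi purge --bucket=mybucket
```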
Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
http://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Mandatory disclosures per §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered office: Berlin
Hello. We are trying to resolve an issue with Ceph. Our OpenShift cluster is blocked and we have tried almost everything.
Actual state is:
MDS_ALL_DOWN: 1 filesystem is offline
MDS_DAMAGE: 1 mds daemon damaged
FS_DEGRADED: 1 filesystem is degraded
MON_DISK_LOW: mon be is low on available space
RECENT_CRASH: 1 daemons have recently crashed
We tried to perform:
cephfs-journal-tool --rank=gml-okd-cephfs:all event recover_dentries summary
cephfs-journal-tool --rank=gml-okd-cephfs:all journal reset
cephfs-table-tool gml-okd-cephfs:all reset session
ceph mds repaired 0
ceph config rm mds mds_verify_scatter
ceph config rm mds mds_debug_scatterstat
ceph tell gml-okd-cephfs scrub start / recursive repair force
After these commands the MDS comes up, but an error appears:
MDS_READ_ONLY: 1 MDSs are read only
We also tried to create a new fs with a new metadata pool, and to delete and recreate the old fs with the same name using the old/new metadata pool.
We got rid of the errors, but the OpenShift cluster did not want to work with the old persistent volumes. The pods reported an error that they could not find the volume, although it was present and, moreover, was bound to a PVC.
Now we have rolled back the cluster and are trying to clear the MDS error. Any ideas on what to try?
Thanks