Hi everyone,
Our production Ceph cluster (Hammer) has 3 monitors and 300+ OSDs. The monitor daemons run on hosts separate from the OSD daemons.
OpenStack VMs run on RBD. We now need to do maintenance on the monitor hosts, so each monitor host must be shut down briefly, one by one,
but two monitor hosts will be kept online at all times.
What is the impact on the VMs while a monitor is down?
How can we reduce the impact?
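For reference, this is roughly the procedure we have in mind; a sketch only, and the service command syntax is our assumption since it depends on the init system:

    # confirm all three monitors are in quorum before touching anything
    ceph quorum_status

    # on the monitor host under maintenance (sysvinit syntax common on Hammer)
    service ceph stop mon.<hostname>
    ... do the maintenance ...
    service ceph start mon.<hostname>

    # wait until the monitor has rejoined quorum before moving to the next host
    ceph quorum_status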
Dear Cephers,
On a Ceph node (Luminous 12.2.13), I have an HDD that smartctl indicates is going to fail soon but is still operational, and I would like to replace it now. It is no fun to rebalance the data out, put a new disk in, and then rebalance again.
I am thinking of using ceph-objectstore-tool to copy the data from the failing HDD to a new HDD on the same Ceph node, keeping the same OSD ID. Is it possible, and how?
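The rough sequence I have in mind looks like this; just a sketch, the noout step and paths are my assumptions, and the new disk would need to be prepared with the same OSD ID beforehand:

    ceph osd set noout                     # avoid rebalancing while the OSD is down
    systemctl stop ceph-osd@<id>

    # enumerate the PGs on the failing disk, then export each one
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --op list-pgs
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --pgid <pgid> --op export --file /backup/<pgid>.export

    # import each PG into the freshly prepared OSD on the new disk
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --pgid <pgid> --op import --file /backup/<pgid>.export

    systemctl start ceph-osd@<id>
    ceph osd unset noout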
thanks in advance for advice,
samuel
huxiaoyu@horebdata.cn
Hi everyone, we have a Ceph cluster for object storage only; the rgws are accessible from the internet, and everything is OK.
Now, one of our teams/clients requires that their data never be accessible from the internet.
In case of any security bug/breach/whatever, they want access to their data to be limited to the local network.
Before creating a second "private" cluster, is there a way to achieve this on our current "public" cluster?
Would a multi-zone setup without replication help me with that?
A public rgw for public access on the "pub_zone", and a private rgw for private access on the "prv_zone"?
pubzone.rgw.buckets.data
prvzone.rgw.buckets.data
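In case it helps the discussion, the setup I imagine would be created with something like the following; a sketch with made-up realm/zonegroup/zone names, where the private rgw instances would additionally listen on the local network only:

    radosgw-admin realm create --rgw-realm=private
    radosgw-admin zonegroup create --rgw-zonegroup=prv_group --rgw-realm=private --master --default
    radosgw-admin zone create --rgw-zonegroup=prv_group --rgw-zone=prv_zone --master --default
    radosgw-admin period update --commit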
If the "public" rgws is hacked, without the access_key/secret_key of the private zone, is there any possibilities to access the private zone?
Does a multi-realms would help me to secure it more?
Any input would be really appreciated.
I don't want to put too much energy into false security and/or security by obscurity,
so if these multi-site/multi-realm scenarios are useless from a security point of view, please tell me. :-)
Thanks!
JS
Hi everyone,
Our Ceph cluster has been stuck in syncing status for a long time after executing the radosgw-admin data sync init command.
-----
          realm dcd64504-c445-4810-9b83-851875443bcd (storage)
      zonegroup 313a345a-4886-4cb3-8d06-0fe3919d591a (mastergroup)
           zone 76fc5fe2-9f89-4419-b611-ab275000b358 (dc01)
  metadata sync no sync (zone is master)
      data sync source: cc4e8e55-988a-430e-b1df-4d88f0c81f4f (dc02)
                        syncing
                        full sync: 117/128 shards
                        full sync: 3 buckets to sync
                        incremental sync: 11/128 shards
                        data is behind on 117 shards
                        behind shards: [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]
-----
Can anyone give me a hint on how to get the synchronization job to finish?
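For context, these are the kinds of commands I have been looking at for diagnostics; the sync run at the end is only a guess on my part that it might unstick things (zone name taken from the output above):

    # per-shard detail for the stuck source zone
    radosgw-admin data sync status --source-zone=dc02

    # check for recorded sync errors
    radosgw-admin sync error list

    # run data sync in the foreground
    radosgw-admin data sync run --source-zone=dc02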
Many thanks!
--
Nghia Viet Tran (Mr)
mgm technology partners Vietnam Co. Ltd
7 Phan Châu Trinh
Đà Nẵng, Vietnam
+84 935905659
nghia.viet.tran@mgm-tp.com
https://www.mgm-tp.com/en/
Visit us on LinkedIn (https://www.linkedin.com/company/mgm-technology-partners-vietnam-co-ltd) and Facebook (https://www.facebook.com/mgmTechnologyPartnersVietnam)!
Innovation Implemented.
General Director: Frank Müller
Registered office: 7 Pasteur, Hải Châu 1, Hải Châu, Đà Nẵng
MST/Tax 0401703955
Hello all,
is it allowed to configure and activate a cache tier for a pool that contains in-use RBD images?
The documentation (https://docs.ceph.com/docs/mimic/rados/operations/cache-tiering/) doesn't say anything about this, but we have experienced errors with our VMs consuming Ceph RBD volumes.
Is there any other preparatory measure to take before we issue the set-overlay command?
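For completeness, the sequence we would follow is the standard one from the documentation above; a sketch with placeholder pool names and an arbitrary target size:

    ceph osd tier add rbd_pool cache_pool
    ceph osd tier cache-mode cache_pool writeback
    ceph osd tier set-overlay rbd_pool cache_pool

    # required hit-set and sizing settings on the cache pool
    ceph osd pool set cache_pool hit_set_type bloom
    ceph osd pool set cache_pool target_max_bytes 1099511627776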
Kind regards,
Laszlo
Hi,
I got involved in a case where a Nautilus cluster was experiencing MDSes
asserting with the backtrace mentioned in this ticket:
https://tracker.ceph.com/issues/36349
ceph_assert(follows >= realm->get_newest_seq());
In the end we needed to use this tooling to get one MDS running again:
https://docs.ceph.com/docs/master/cephfs/disaster-recovery-experts/#using-a…
The root cause seems to be that this Nautilus cluster was running
Multi-MDS with a very large number of CephFS snapshots.
After a couple of days of scanning (scan_links appears to be single-threaded!)
we finally got a single MDS running again with a usable CephFS filesystem.
At the moment chown operations are running to set all the permissions back
to what they should be.
The outstanding question now: is it safe to enable Multi-MDS again on a
CephFS filesystem which still has this many snapshots and is currently
running with a single MDS?
New snapshots are disabled at the moment, so those won't be created.
In addition: How safe is it to remove snapshots? As this will result in
metadata updates.
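For reference, what re-enabling Multi-MDS and removing a snapshot would amount to; a sketch assuming the filesystem is named "cephfs" and a client mount at /mnt/cephfs (whether these are safe here is exactly my question):

    # allow a second active MDS again
    ceph fs set cephfs max_mds 2

    # snapshots are removed by deleting the snapshot directory on a mounted client
    rmdir /mnt/cephfs/some/dir/.snap/<snapname>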
Thanks
Wido
Hello.
I have a relatively new Ceph installation that is only running CephFS at the moment. We are seeing intermittent issues where "ceph -s" reports "MDS report slow requests"; sometimes the MDSes crash and take a while to recover/replay, or we have to manually restart an MDS service to get the state back to HEALTH_OK.
Is there any documentation for recommended configuration?
Here is our cluster setup:
35 total nodes, 88 cores, 512 GB RAM, 100 Gb network
2 CephFS data pools: one all SSD, the other all NVMe
3 active MDS: 1 pinned to the NVMe pool/dir, 1 pinned to another large directory, and the third with no pinning
2 standby MDS
ceph config dump:
mds advanced mds_beacon_grace 60.000000
mds basic mds_cache_memory_limit 68719476736
mds advanced mds_cache_trim_threshold 65536
mds advanced mds_recall_max_decay_rate 2.000000
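When the warning appears, I can gather more detail with commands like these; a sketch, with the MDS daemon name as a placeholder, run on the host carrying the affected rank:

    ceph health detail
    ceph daemon mds.<name> dump_ops_in_flight   # the slow/blocked requests
    ceph daemon mds.<name> session ls           # clients holding caps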
Please let me know if more info is required.
Thanks!
This is to announce the retirement of the v13.2.X Mimic stable release
series: there will be no more backport releases to the Mimic series. Any
further patches to the mimic branch will have to be tested by the developer
submitting them and approved by the tech lead of the respective component
before merging, to keep the branch stable.
The last release of Mimic was v13.2.10, released in April 2020. This is
in keeping with the policy of 2 active stable releases and a 24-month support cycle,
which is documented at
https://docs.ceph.com/docs/master/releases/general/#lifetime-of-stable-rele…
Users are requested to upgrade to Nautilus or Octopus.
For the official blog post link please refer to
https://ceph.io/releases/mimic-is-retired/
--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)