Hi all,
The slave zone shows that metadata is caught up with the master, but comparing radosgw-admin bucket list | wc between the master and the slave zone, the counts are not equal.
How can I force a sync?
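For reference, this is roughly what I am comparing on each zone, plus the sync state checks I have looked at (the bucket name is just a placeholder):
# bucket count on each zone
radosgw-admin bucket list | wc -l
# overall multisite sync state
radosgw-admin sync status
# per-bucket sync state
radosgw-admin bucket sync status --bucket=<bucket>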
Hello Ceph users,
We are experiencing an issue with Ceph 14.2.9 / the RGW Beast frontend, and we are seeing it on two separate clusters.
Over a few weeks, qlen and qactive keep climbing and never return to zero. At some point performance starts to degrade and we need to restart the services. We are viewing the queue numbers in perfcounters_dump; in objecter_requests we aren't seeing any requests (apart from very briefly).
We can reproduce the issue by using S3 Browser and setting the concurrent downloads to 100. After completing a download of ~1000 files, the queue length has incremented by 2-5 and never returns to zero. Subsequent bulk downloads increase qlen further.
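We are reading those counters from the RGW admin socket, roughly like this (the daemon name will differ per deployment):
# queue counters (qlen / qactive are in the "rgw" section)
ceph daemon client.rgw.<name> perf dump
# in-flight RADOS requests from the RGW client
ceph daemon client.rgw.<name> objecter_requests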
We have the following tunables set:
rgw_bucket_index_max_aio 128
rgw_dns_name <fqdn##>
rgw_frontends beast ssl_port=443 ssl_certificate=<CERT##>
rgw_max_chunk_size 4194304
rgw_num_rados_handles 16
rgw_thread_pool_size 500
Has anyone seen this, or have any ideas on how to debug it further?
Any additional tuning suggestions? We have ~350TB of S3 data.
Glen
Hello List,
First of all: yes, I made mistakes. Now I am trying to recover :-/
I had a healthy 3-node cluster which I wanted to convert to a single-node one.
My goal was to reinstall a fresh 3-node cluster and start with 2 nodes.
I was able to turn it from a 3-node cluster into a healthy 2-node cluster.
Then the problems began.
I started changing the pool to size=1 and min_size=1.
Health was okay up to that point. Then all of a sudden both nodes got
fenced... one node refused to boot, mons were missing, etc. To make a
long story short, here is where I am right now:
root@node03:~ # ceph -s
cluster b3be313f-d0ef-42d5-80c8-6b41380a47e3
health HEALTH_WARN
53 pgs stale
53 pgs stuck stale
monmap e4: 2 mons at {0=10.15.15.3:6789/0,1=10.15.15.2:6789/0}
election epoch 298, quorum 0,1 1,0
osdmap e6097: 14 osds: 9 up, 9 in
pgmap v93644673: 512 pgs, 1 pools, 1193 GB data, 304 kobjects
1088 GB used, 32277 GB / 33366 GB avail
459 active+clean
53 stale+active+clean
root@node03:~ # ceph osd tree
ID WEIGHT TYPE NAME UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 32.56990 root default
-2 25.35992 host node03
0 3.57999 osd.0 up 1.00000 1.00000
5 3.62999 osd.5 up 1.00000 1.00000
6 3.62999 osd.6 up 1.00000 1.00000
7 3.62999 osd.7 up 1.00000 1.00000
8 3.62999 osd.8 up 1.00000 1.00000
19 3.62999 osd.19 up 1.00000 1.00000
20 3.62999 osd.20 up 1.00000 1.00000
-3 7.20998 host node02
3 3.62999 osd.3 up 1.00000 1.00000
4 3.57999 osd.4 up 1.00000 1.00000
1 0 osd.1 down 0 1.00000
9 0 osd.9 down 0 1.00000
10 0 osd.10 down 0 1.00000
17 0 osd.17 down 0 1.00000
18 0 osd.18 down 0 1.00000
My main mistakes seem to have been:
--------------------------------
ceph osd out osd.1
ceph auth del osd.1
systemctl stop ceph-osd@1
ceph osd rm 1
umount /var/lib/ceph/osd/ceph-1
ceph osd crush remove osd.1
As far as I can tell, Ceph is waiting for and needs data from that osd.1
(which I removed).
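In hindsight, I think the documented removal order (which lets the data drain off the OSD before it disappears) would have been roughly:
# mark the OSD out and wait for rebalancing to finish before anything else
ceph osd out osd.1
ceph -w                      # wait until all PGs are active+clean again
# only then stop the daemon and remove it from crush, auth and the osd map
systemctl stop ceph-osd@1
ceph osd crush remove osd.1
ceph auth del osd.1
ceph osd rm 1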
root@node03:~ # ceph health detail
HEALTH_WARN 53 pgs stale; 53 pgs stuck stale
pg 0.1a6 is stuck stale for 5086.552795, current state
stale+active+clean, last acting [1]
pg 0.142 is stuck stale for 5086.552784, current state
stale+active+clean, last acting [1]
pg 0.1e is stuck stale for 5086.552820, current state
stale+active+clean, last acting [1]
pg 0.e0 is stuck stale for 5086.552855, current state
stale+active+clean, last acting [1]
pg 0.1d is stuck stale for 5086.552822, current state
stale+active+clean, last acting [1]
pg 0.13c is stuck stale for 5086.552791, current state
stale+active+clean, last acting [1]
[...] SNIP [...]
pg 0.e9 is stuck stale for 5086.552955, current state
stale+active+clean, last acting [1]
pg 0.87 is stuck stale for 5086.552939, current state
stale+active+clean, last acting [1]
When I try to start osd.1 manually, I get:
--------------------------------------------
2020-02-10 18:48:26.107444 7f9ce31dd880 0 ceph version 0.94.10
(b1e0532418e4631af01acbc0cedd426f1905f4af), process ceph-osd, pid
10210
2020-02-10 18:48:26.134417 7f9ce31dd880 0
filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
2020-02-10 18:48:26.184202 7f9ce31dd880 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
FIEMAP ioctl is supported and appears to work
2020-02-10 18:48:26.184209 7f9ce31dd880 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
FIEMAP ioctl is disabled via 'filestore fiemap' config option
2020-02-10 18:48:26.184526 7f9ce31dd880 0
genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
syncfs(2) syscall fully supported (by glibc and kernel)
2020-02-10 18:48:26.184585 7f9ce31dd880 0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: extsize
is disabled by conf
2020-02-10 18:48:26.309755 7f9ce31dd880 0
filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD journal
mode: checkpoint is not enabled
2020-02-10 18:48:26.633926 7f9ce31dd880 1 journal _open
/var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size
4096 bytes, directio = 1, aio = 1
2020-02-10 18:48:26.642185 7f9ce31dd880 1 journal _open
/var/lib/ceph/osd/ceph-1/journal fd 20: 5367660544 bytes, block size
4096 bytes, directio = 1, aio = 1
2020-02-10 18:48:26.664273 7f9ce31dd880 0 <cls>
cls/hello/cls_hello.cc:271: loading cls_hello
2020-02-10 18:48:26.732154 7f9ce31dd880 0 osd.1 6002 crush map has
features 1107558400, adjusting msgr requires for clients
2020-02-10 18:48:26.732163 7f9ce31dd880 0 osd.1 6002 crush map has
features 1107558400 was 8705, adjusting msgr requires for mons
2020-02-10 18:48:26.732167 7f9ce31dd880 0 osd.1 6002 crush map has
features 1107558400, adjusting msgr requires for osds
2020-02-10 18:48:26.732179 7f9ce31dd880 0 osd.1 6002 load_pgs
2020-02-10 18:48:31.939810 7f9ce31dd880 0 osd.1 6002 load_pgs opened 53 pgs
2020-02-10 18:48:31.940546 7f9ce31dd880 -1 osd.1 6002 log_to_monitors
{default=true}
2020-02-10 18:48:31.942471 7f9ce31dd880 1 journal close
/var/lib/ceph/osd/ceph-1/journal
2020-02-10 18:48:31.969205 7f9ce31dd880 -1 ** ERROR: osd
init failed: (1) Operation not permitted
It's mounted:
/dev/sdg1 3.7T 127G 3.6T 4% /var/lib/ceph/osd/ceph-1
Is there any way I can get osd.1 back in?
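My best guess is that I at least need to restore the key I deleted and put the OSD back into the crush map, roughly like this (the weight and host below are only guesses based on the tree above, and I have not verified this is correct or safe):
# re-import the key that 'ceph auth del osd.1' removed
ceph auth add osd.1 osd 'allow *' mon 'allow profile osd' -i /var/lib/ceph/osd/ceph-1/keyring
# put the OSD back under its host in the crush map
ceph osd crush add osd.1 3.58 host=node03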
Thanks a lot,
mario
I have a problem with one OSD (osd.5 on server lod) that keeps crashing.
Often it crashes immediately on restart, but oddly a server reboot fixes
that, and it always starts fine from the command line. Service status and
journalctl don't show any useful information.
There are two OSDs on the server; the other OSD never has a problem.
Server
* osd services only
* 8GB Ram
* Nautilus 14.2.9
* osd.5 : 1TB - crashes
* osd.12 : 500GB - Fine
So I ran it from the command line and copied the console dump when it
crashed. Any thoughts? Should I create a bug report for it?
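For what it's worth, Nautilus also records aborts in the crash module, so something like the following should show the same backtrace without having to run the OSD by hand (the crash ID is a placeholder):
# list recorded crashes, then dump the details of one
ceph crash ls
ceph crash info <crash-id>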
--
Lindsay
Hi all,
I have a 39-node, 1404-spinning-disk Ceph Mimic cluster across 6 racks,
for a total of 9.1PiB raw, about 40% utilized. These storage nodes
started their life on Ubuntu 14.04 and were in-place upgraded to 16.04
two years ago; however, I have started a project to do fresh installs of
each OSD node onto Ubuntu 18.04 to keep things fresh and well supported.
I am reaching out to see what others might suggest in terms of
strategy to get these hosts updated more quickly than my current approach allows.
Current strategy:
1. Pick 3 nodes and drain them by modifying the crush weight (roughly as sketched after this list)
2. Fresh install 18.04 using an automation tool (MAAS) + some Ansible playbooks to set up the server
3. Purge the node's worth of OSDs (this causes data to be 'misplaced' due to the rack weight changing)
4. Run ceph-volume lvm batch for the OSD node
5. Move the OSDs into the desired hosts in the crush map (large rebalancing to fill back up)
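The drain in step 1 is just a loop over the node's OSDs, roughly like this (the hostname is a placeholder):
# gradually push data off every OSD on the node being drained
for id in $(ceph osd ls-tree <hostname>); do
    ceph osd crush reweight osd.$id 0
done
# then wait for 'ceph -s' to report all PGs active+clean before reinstalling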
If anyone has suggestions on a quicker way to do this, I am all ears.
I am wondering whether it is necessary to drain/fill OSD nodes at all,
and whether this can be done with just a fresh install that doesn't touch
the OSDs. However, I don't know how to perform a fresh installation and
then tell Ceph that I have OSDs with data on them so that they
re-register with the cluster. Or is there a better order of operations
for draining/filling that avoids a large number of objects being
misplaced due to manipulating the crush map?
That being said, our cluster is a bit older and the majority of our
bluestore OSDs are provisioned with the 'simple' method, using a small
metadata partition and the remainder as a raw partition, whereas the
suggested layout now seems to be LVM with metadata on tmpfs.
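If skipping the drain/fill is workable, I imagine re-registering the existing OSDs after a reinstall would look roughly like this for the 'simple' layout (the device name is a placeholder, and I have not tried this):
# rediscover an existing 'simple' OSD from its data partition and write out its JSON description
ceph-volume simple scan /dev/sdb1
# enable and start everything that was scanned
ceph-volume simple activate --all
# LVM-based OSDs would instead be picked up with:
ceph-volume lvm activate --all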
Anyways, I'm all ears and appreciate any feedback.
Jared Baker
Ontario Institute for Cancer Research
Hi. I found a default RocksDB option in BlueStore that I can't find in
facebook/rocksdb.
recycle_log_file_num: this looks like a boolean option in facebook/rocksdb,
but in the default Ceph configs its value is 4.
Can someone tell me what it means?
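For context, the value I am looking at is part of the OSD's bluestore_rocksdb_options string, which can be checked on a running OSD roughly like this (the OSD id is a placeholder):
# dump the rocksdb option string BlueStore passes to its embedded rocksdb
ceph daemon osd.0 config get bluestore_rocksdb_options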