Hey all,
I'm creating a new post for this issue, as we've narrowed the problem
down to a part-size limitation on multipart uploads. We have discovered
that in both our production Nautilus (14.2.11) cluster and our lab
Nautilus (14.2.10) cluster, multipart uploads with a configured part
size greater than 16777216 bytes (16 MiB) return a status 500 /
internal server error from radosgw.
So far I have increased the following rgw settings that looked
suspect, without any success or improvement for larger part sizes:
"rgw_get_obj_window_size": "16777216",
"rgw_put_obj_min_window_size": "16777216",
I am trying to determine whether this is caused by a conservative
default setting somewhere that I don't know about, or whether it is a
bug.
I would appreciate it if someone on Nautilus with rgw could also test
this and provide feedback. It's very easy to reproduce; configuring
the part size with the AWS CLI just requires the following in your aws
'config' file:
s3 =
multipart_chunksize = 32MB
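In case anyone wants to cross-check with a different client, s3cmd
lets you set the part size on the command line (bucket name as in the
example further down; any part size above 16 MiB should trigger the
same 500 if the problem is server-side):
s3cmd --multipart-chunk-size-mb=32 put 4GBfile s3://troubleshooting/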
rgw server logs during a failed multipart upload (32MB chunk/partsize):
2020-09-08 15:59:36.054 7f2d32fa6700 1 ====== starting new request
req=0x55953dc36930 =====
2020-09-08 15:59:36.082 7f2d32fa6700 -1 res_query() failed
2020-09-08 15:59:36.138 7f2d32fa6700 1 ====== req done
req=0x55953dc36930 op status=0 http_status=200 latency=0.0839988s
======
2020-09-08 16:00:07.285 7f2d3dfbc700 1 ====== starting new request
req=0x55953dc36930 =====
2020-09-08 16:00:07.285 7f2d3dfbc700 -1 res_query() failed
2020-09-08 16:00:07.353 7f2d00741700 1 ====== starting new request
req=0x55954dd5e930 =====
2020-09-08 16:00:07.357 7f2d00741700 -1 res_query() failed
2020-09-08 16:00:07.413 7f2cc56cb700 1 ====== starting new request
req=0x55953dc02930 =====
2020-09-08 16:00:07.417 7f2cc56cb700 -1 res_query() failed
2020-09-08 16:00:07.473 7f2cb26a5700 1 ====== starting new request
req=0x5595426f6930 =====
2020-09-08 16:00:07.473 7f2cb26a5700 -1 res_query() failed
2020-09-08 16:00:09.465 7f2d3dfbc700 0 WARNING: set_req_state_err
err_no=35 resorting to 500
2020-09-08 16:00:09.465 7f2d3dfbc700 1 ====== req done
req=0x55953dc36930 op status=-35 http_status=500 latency=2.17997s
======
2020-09-08 16:00:09.549 7f2d00741700 0 WARNING: set_req_state_err
err_no=35 resorting to 500
2020-09-08 16:00:09.549 7f2d00741700 1 ====== req done
req=0x55954dd5e930 op status=-35 http_status=500 latency=2.19597s
======
2020-09-08 16:00:09.605 7f2cc56cb700 0 WARNING: set_req_state_err
err_no=35 resorting to 500
2020-09-08 16:00:09.609 7f2cc56cb700 1 ====== req done
req=0x55953dc02930 op status=-35 http_status=500 latency=2.19597s
======
2020-09-08 16:00:09.641 7f2cb26a5700 0 WARNING: set_req_state_err
err_no=35 resorting to 500
2020-09-08 16:00:09.641 7f2cb26a5700 1 ====== req done
req=0x5595426f6930 op status=-35 http_status=500 latency=2.16797s
======
awscli client-side output during a failed multipart upload:
root@jump:~# aws --no-verify-ssl --endpoint-url
http://lab-object.cancercollaboratory.org:7480 s3 cp 4GBfile
s3://troubleshooting
upload failed: ./4GBfile to s3://troubleshooting/4GBfile An error
occurred (UnknownError) when calling the UploadPart operation (reached
max retries: 2): Unknown
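If more detail would help, I can re-run the upload with rgw debug
logging turned up; roughly (again, the admin socket path will differ):
# temporarily raise rgw logging on the gateway, then repeat the upload
ceph daemon /var/run/ceph/ceph-client.rgw.<id>.asok config set debug_rgw 20
ceph daemon /var/run/ceph/ceph-client.rgw.<id>.asok config set debug_ms 1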
Thanks,
Jared Baker
Cloud Architect for the Cancer Genome Collaboratory
Ontario Institute for Cancer Research
Hi *,
I was just testing rbd-mirror on ceph version 15.2.4-864-g0f510cb110
(0f510cb1101879a5941dfa1fa824bf97db6c3d08) octopus (stable) and
noticed mgr errors on the primary site (also in version 15.2.2):
---snip---
2020-09-10T11:20:01.724+0200 7f1c1b46a700 0 [dashboard ERROR
controllers.rbd_mirror] Failed to list mirror image status rbd-pool1
Traceback (most recent call last):
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in _get_pool_datum
for image in mirror_image_status
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in <listcomp>
for image in mirror_image_status
KeyError: 'description'
2020-09-10T11:20:01.724+0200 7f1c1b46a700 0 [dashboard ERROR
viewcache] Error while calling fn=<function _get_pool_datum at
0x7f1c5ac58f28> ex='description'
Traceback (most recent call last):
File "/usr/share/ceph/mgr/dashboard/tools.py", line 156, in run
val = self.fn(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in _get_pool_datum
for image in mirror_image_status
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in <listcomp>
for image in mirror_image_status
KeyError: 'description'
2020-09-10T11:20:01.724+0200 7f1c1ec71700 0 [dashboard ERROR
viewcache] Error while calling fn=<function _get_content_data at
0x7f1c5ac610d0> ex='description'
Traceback (most recent call last):
File "/usr/share/ceph/mgr/dashboard/tools.py", line 156, in run
val = self.fn(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 283, in _get_content_data
_, pool = _get_pool_datum(pool_name)
File "/usr/share/ceph/mgr/dashboard/tools.py", line 254, in wrapper
return rvc.run(fn, args, kwargs)
File "/usr/share/ceph/mgr/dashboard/tools.py", line 236, in run
raise self.exception
File "/usr/share/ceph/mgr/dashboard/tools.py", line 156, in run
val = self.fn(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in _get_pool_datum
for image in mirror_image_status
File "/usr/share/ceph/mgr/dashboard/controllers/rbd_mirroring.py",
line 259, in <listcomp>
for image in mirror_image_status
KeyError: 'description'
2020-09-10T11:20:01.724+0200 7f1c23ebb700 0 [dashboard ERROR request]
[X.X.X.X:42050] [GET] [500] [0.020s] [admin] [513.0B]
/api/block/mirroring/summary
2020-09-10T11:20:01.728+0200 7f1c23ebb700 0 [dashboard ERROR request]
[b'{"status": "500 Internal Server Error", "detail": "The server
encountered an unexpected condition which prevented it from fulfilling
the request.", "request_id": "64ea5407-171b-4f6c-862e-fc881e68fec0"}
---snip---
The mirroring itself seems to work fine; I haven't noticed issues with
the image on the secondary site. I also uploaded a screenshot of the
dashboard [1]; the installation is on openSUSE Leap 15.2.
I couldn't find a tracker issue or a report on bugzilla.opensuse.org,
but before creating a ticket I wanted to ask whether this is a known
issue. The rbd-mirror panel works fine on the secondary site, I assume
because there is actually an rbd-mirror daemon running there. Or maybe
I just misunderstand the purpose, but if there is no mirror daemon on
site A there shouldn't be an error message, should there?
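For what it's worth, this is roughly the CLI equivalent of what the
dashboard seems to query (a sketch with my pool name; I'm assuming the
dashboard builds its list from the per-image mirror status, which is
what the traceback points at):
# per-pool summary and per-image mirroring status
rbd mirror pool status rbd-pool1 --verbose
# on the primary (no local rbd-mirror daemon) the per-image status may
# lack the 'description' field, which would match the KeyError above
rbd mirror image status rbd-pool1/<image>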
Regards,
Eugen
[1] https://paste.opensuse.org/36442780
Hi *,
I'm currently testing rbd-mirror on ceph version
15.2.4-864-g0f510cb110 (0f510cb1101879a5941dfa1fa824bf97db6c3d08)
octopus (stable) and saw this during an rbd import of a fresh image on
the primary site:
---snip---
ceph1:~ # rbd import /mnt/SUSE-OPENSTACK-CLOUD-7-x86_64-GM-DVD1.iso
rbd-pool1/cloud7
Importing image: 5% complete...2020-09-10T12:00:10.220+0200
7ff1b9ffb700 -1 librbd::SnapshotRemoveRequest: 0x7ff19c022060 send_op:
snapshot doesn't exist
2020-09-10T12:00:10.224+0200 7ff1b9ffb700 -1
librbd::SnapshotRemoveRequest: 0x7ff19c022060 should_complete:
encountered error: (2) No such file or directory
2020-09-10T12:00:10.232+0200 7ff1ba7fc700 -1 librbd::Operations: No
such snapshot found.
Importing image: 6% complete...2020-09-10T12:00:13.940+0200
7ff1b9ffb700 -1 librbd::SnapshotRemoveRequest: 0x7ff19c032eb0 send_op:
snapshot doesn't exist
2020-09-10T12:00:13.940+0200 7ff1b9ffb700 -1
librbd::SnapshotRemoveRequest: 0x7ff19c032eb0 should_complete:
encountered error: (2) No such file or directory
Importing image: 7% complete...2020-09-10T12:00:18.596+0200
7ff1b9ffb700 -1 librbd::SnapshotRemoveRequest: 0x7ff19c027af0 send_op:
snapshot doesn't exist
2020-09-10T12:00:18.596+0200 7ff1b9ffb700 -1
librbd::SnapshotRemoveRequest: 0x7ff19c027af0 should_complete:
encountered error: (2) No such file or directory
Importing image: 8% complete...2020-09-10T12:00:24.048+0200
7ff1b9ffb700 -1 librbd::SnapshotRemoveRequest: 0x7ff19c02f940 send_op:
snapshot doesn't exist
2020-09-10T12:00:24.048+0200 7ff1b9ffb700 -1
librbd::SnapshotRemoveRequest: 0x7ff19c02f940 should_complete:
encountered error: (2) No such file or directory
Importing image: 100% complete...done.
---snip---
The import eventually succeeds and the image is fine (md5sum on the
remote image is correct), but I'm wondering what snapshot it is
looking for since there aren't any (yet). Here's some info about the
image (not sure if it's relevant):
---snip---
ceph1:~ # rbd info rbd-pool1/cloud7
rbd image 'cloud7':
size 1.4 GiB in 347 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 149e4eb8e387a
block_name_prefix: rbd_data.149e4eb8e387a
format: 2
features: layering, exclusive-lock, object-map, fast-diff,
deep-flatten, journaling
op_features:
flags:
create_timestamp: Thu Sep 10 11:59:18 2020
access_timestamp: Thu Sep 10 11:59:18 2020
modify_timestamp: Thu Sep 10 12:03:30 2020
journal: 149e4eb8e387a
mirroring state: enabled
mirroring mode: journal
mirroring global id: 82bb30ec-bd33-4f84-9d0c-b3650b567670
mirroring primary: true
---snip---
Is this worth a bug report or is it already known and I just couldn't
find anything?
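In case it helps narrow this down, these are the commands I would use
to confirm that there really are no snapshots and to look at the
journal the import is writing to (a sketch, using the image from
above):
# confirm the image has no snapshots
rbd snap ls rbd-pool1/cloud7
# journaling is enabled on the image, so inspect its journal
rbd journal info --pool rbd-pool1 --image cloud7
rbd journal status --pool rbd-pool1 --image cloud7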
Regards,
Eugen
Hi,
I haven't done this myself yet, but you should be able to simply move
the (virtual) disk to the new host and start the OSD, depending on the
actual setup. If those are stand-alone OSDs (no separate DB/WAL) it
shouldn't be too difficult [1]. If you're using ceph-volume you could
run 'ceph-volume lvm trigger' in case the OSD doesn't restart on its
own. Although this would not reconstruct any data, it would trigger a
rebalance, since the crush map changes automatically when the OSD
moves to a different host, leading to misplaced objects.
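A rough sketch of what that could look like on the new node (OSD id
and fsid are placeholders, both visible in 'ceph-volume lvm list'):
# discover the OSD LVs on the newly attached LUN and start everything
ceph-volume lvm list
ceph-volume lvm activate --all
# or activate a single OSD explicitly and start its service
ceph-volume lvm activate <osd-id> <osd-fsid>
systemctl start ceph-osd@<osd-id>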
Regards,
Eugen
[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-March/025624.html
Zitat von huxiaoyu(a)horebdata.cn:
> Dear ceph folks,
>
> I encountered an interesting situation: an old FC SAN is connected
> to two ceph OSD nodes, and its LUNs are used as virtual OSDs. When
> one node fails, its LUN can be taken over by the other node. My
> question is, how can I start up the OSD on the new node without
> reconstructing its data? In other words, is there a simple way to
> move an OSD from one node to another?
>
> many thanks,
>
> samuel
>
> huxiaoyu(a)horebdata.cn
Dear ceph folks,
I encountered an interesting situation: an old FC SAN is connected to two ceph OSD nodes, and its LUNs are used as virtual OSDs. When one node fails, its LUN can be taken over by the other node. My question is, how can I start up the OSD on the new node without reconstructing its data? In other words, is there a simple way to move an OSD from one node to another?
many thanks,
samuel
huxiaoyu(a)horebdata.cn
Hello,
I've been trying to understand whether there is any way to get usage information per storage class for buckets.
Since no such information is available from the "radosgw-admin bucket stats" command or from any other endpoint, I
tried to browse the source code but couldn't find any place where the storage class would be exposed in such a way.
It also seems that RadosGW today does not save any counters for the number of objects stored per storage class when it
collects usage stats, which means there is no such metadata saved for a bucket.
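For reference, this is the kind of output I've been looking at (bucket and user names are placeholders); both report
aggregate counters per bucket or per user, without any per-storage-class breakdown:
# per-bucket object and size counters, no storage-class dimension
radosgw-admin bucket stats --bucket=mybucket
# usage log (ops/bytes per user and bucket), also without storage classes
radosgw-admin usage show --uid=someuser --show-log-entries=false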
I was hoping it was at least saved but not exposed, because that would have been an easier fix than adding support to count objects per storage class based on operations, which would touch a lot of places and mean writing to the bucket metadata on each op :(
Is my assumption correct that there is no way to retrieve such information, meaning there is no way to measure such usage?
If the answer is yes, I assume the only way to get something measurable would be to instead use multiple placement
targets, since those are exposed in the bucket info. The downside, though, is that you lose a lot of functionality related to
lifecycle and to moving a single object to another storage class.
Best regards
Tobias
Hi,
I recently added 3 new servers to our Ceph cluster. These servers use the H740P Mini RAID card, and I had to install the HWE kernel on Ubuntu 16.04 in order to get the drives recognized.
We have a 23-node cluster, and normally when we add OSDs they end up mounted like this:
/dev/sde1 3.7T 2.0T 1.8T 54% /var/lib/ceph/osd/ceph-15
/dev/sdj1 3.7T 2.0T 1.7T 55% /var/lib/ceph/osd/ceph-20
/dev/sdd1 3.7T 2.1T 1.6T 58% /var/lib/ceph/osd/ceph-14
/dev/sdc1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-13
However, I noticed this morning that the 3 new servers have the OSDs mounted like this:
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-246
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-240
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-248
tmpfs 47G 28K 47G 1% /var/lib/ceph/osd/ceph-237
Is this normal for deployments going forward, or did something go wrong? These are 12 TB drives, but they are showing up as 47G here instead.
We are using ceph version 12.2.13, and I installed these OSDs using ceph-deploy version 2.0.1.
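In case it's useful, this is how I can check how the new OSDs were actually deployed (using one of the new OSD ids as an example):
# objectstore type and backing devices for one of the new OSDs
ceph osd metadata 246 | grep -E 'osd_objectstore|devices'
# on the new host: list the LVs that ceph-volume created for these OSDs
ceph-volume lvm list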
Thanks in advance,
Shain
Shain Miley | Director of Platform and Infrastructure | Digital Media | smiley(a)npr.org
Hi there, I've run into a problem with ceph orch that I cannot find anything about in the docs. I removed an OSD using 'ceph osd purge osd.27' (the old style) and forgot that there is a new way to do this in Octopus. Now I see two osd.27 entries via 'ceph orch ps'; is there a way to clean this up properly?
osd.27 sds-0 running (7h) 49s ago 7h 15.2.4 docker.io/ceph/ceph:v15 852b28cb10de acab3229c945
osd.27 sds-2 stopped 19s ago 4d <unknown> docker.io/ceph/ceph:v15 <unknown> <unknown>
Dear Ceph Users,
I am testing my 3-node Proxmox + Ceph cluster.
I have performed an OSD benchmark with the command below:
# ceph tell osd.0 bench
Do I need to perform any cleanup to delete the benchmark data from the OSD?
I have googled for this but could not find any mention of cleanup steps
after the OSD benchmark command.
Thanks
Jayesh
Hi *,
I'm wondering what actually happens in the ceph cluster if I
copy/sync the content of one bucket into a different bucket. I'll just
describe what I saw, and maybe someone can clarify what is happening.
I have a RGW in a small test cluster (15.2.2) and created a bucket
(bucket1) with s3cmd, then put a large file into bucket1. This takes
some time, of course; I see the network utilization on the client and
also the OSD load on the cluster during the upload (expected).
Then I create bucket2 and run 's3cmd cp s3://bucket1/file
s3://bucket2', which takes only a couple of seconds. I only see some
OSD load for a short period of time during the copy process, but
that's it. The copied file is available almost immediately.
How does this work? It seems as if there's (almost) no client traffic
(except for the cp command, of course) to recreate the file in the
second bucket, as if the OSDs are directly instructed to create copies
of the objects.
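If it helps to pinpoint it, this is how I would dig further (a
sketch; bucket names as above, and the radoslist subcommand assumes a
reasonably recent radosgw-admin):
# show the raw requests the client sends; I would expect a single PUT
# with an x-amz-copy-source header, i.e. a server-side copy
s3cmd --debug cp s3://bucket1/file s3://bucket2
# list the rados objects backing bucket2, to see whether new data
# objects were written or existing ones are referenced
radosgw-admin bucket radoslist --bucket=bucket2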
I would highly appreciate it if anyone could clarify this for me.
Thanks!
Eugen