Hello fellow Ceph users,
we have released our new Ceph benchmark paper [0]. The platform used is
Proxmox VE 6.2 with Ceph Octopus, running on a new AMD EPYC (Zen 2) CPU
with U.2 SSDs (details in the paper).
The paper illustrates the performance that is possible with a 3-node
cluster without significant tuning.
Everyone is welcome to share their experience and add to the discussion,
preferably on our forum thread [1] with our fellow Proxmox VE users.
--
Cheers,
Alwin
[0] https://proxmox.com/en/downloads/item/proxmox-ve-ceph-benchmark-2020-09
[1] https://forum.proxmox.com/threads/proxmox-ve-ceph-benchmark-2020-09-hyper-c…
Hi,
I have been setting up a new cluster with a combination of cephadm and
ceph orch.
I have run into a problem with rgw daemons that do not start.
I have been following the documentation:
https://docs.ceph.com/en/latest/cephadm/install/ - the RGW section
ceph orch apply rgw ikea cn-dc9-1 --port=8080 --placement="3 itcnchn-bb4065 itcnchn-bb4066 itcnchn-bb4067"
I then waited for everything to settle, expecting the RGW daemons to start
on nodes itcnchn-bb4065, itcnchn-bb4066 and itcnchn-bb4067, but that was
not the case.
ceph orch ps itcnchn-bb4065
NAME                                     HOST            STATUS        REFRESHED  AGE  VERSION    IMAGE NAME                 IMAGE ID      CONTAINER ID
alertmanager.itcnchn-bb4065              itcnchn-bb4065  running (3w)  2m ago     4w   0.20.0     prom/alertmanager:v0.20.0  0881eb8f169f  c595826b8867
crash.itcnchn-bb4065                     itcnchn-bb4065  running (4w)  2m ago     4w   15.2.4     docker.io/ceph/ceph:v15    852b28cb10de  4598d37bf91c
grafana.itcnchn-bb4065                   itcnchn-bb4065  running (4w)  2m ago     4w   6.6.2      ceph/ceph-grafana:latest   87a51ecf0b1c  ad8e16d12ef8
mgr.itcnchn-bb4065.jxtcer                itcnchn-bb4065  running (4w)  2m ago     4w   15.2.4     docker.io/ceph/ceph:v15    852b28cb10de  d9278a7e772c
mon.itcnchn-bb4065                       itcnchn-bb4065  running (4w)  2m ago     4w   15.2.4     docker.io/ceph/ceph:v15    852b28cb10de  e0edd4a9038a
prometheus.itcnchn-bb4065                itcnchn-bb4065  running (3w)  2m ago     4w   2.18.1     prom/prometheus:v2.18.1    de242295e225  1103b62e191b
rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict  itcnchn-bb4065  error         2m ago     22h  <unknown>  docker.io/ceph/ceph:v15    <unknown>     <unknown>
I have logged into itcnchn-bb4065 and tried to start the rgw container
myself, using the command from
/var/lib/ceph/0aba4c3a-f735-11ea-af33-02423ee04865/rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict/unit.run:
/usr/bin/docker run --rm --net=host --ipc=host \
  --name ceph-0aba4c3a-f735-11ea-af33-02423ee04865-rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict \
  -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=itcnchn-bb4065 \
  -v /var/run/ceph/0aba4c3a-f735-11ea-af33-02423ee04865:/var/run/ceph:z \
  -v /var/log/ceph/0aba4c3a-f735-11ea-af33-02423ee04865:/var/log/ceph:z \
  -v /var/lib/ceph/0aba4c3a-f735-11ea-af33-02423ee04865/crash:/var/lib/ceph/crash:z \
  -v /var/lib/ceph/0aba4c3a-f735-11ea-af33-02423ee04865/rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict:/var/lib/ceph/radosgw/ceph-rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict:z \
  -v /var/lib/ceph/0aba4c3a-f735-11ea-af33-02423ee04865/rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict/config:/etc/ceph/ceph.conf:z \
  --entrypoint /usr/bin/radosgw docker.io/ceph/ceph:v15 \
  -n client.rgw.ikea.cn-dc9-1.itcnchn-bb4065.xpwict -f --setuser ceph --setgroup ceph \
  --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix="debug "
debug 2020-10-14T06:55:54.835+0000 7f87075b1280  0 deferred set uid:gid to 167:167 (ceph:ceph)
debug 2020-10-14T06:55:54.835+0000 7f87075b1280  0 ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable), process radosgw, pid 1
debug 2020-10-14T06:55:54.835+0000 7f87075b1280  0 framework: beast
debug 2020-10-14T06:55:54.835+0000 7f87075b1280  0 framework conf key: port, val: 8080
debug 2020-10-14T06:55:54.835+0000 7f87075b1280  1 radosgw_Main not setting numa affinity
debug 2020-10-14T06:55:54.875+0000 7f87075b1280  1 Cannot find zone id= (name=cn-dc9-1), switching to local zonegroup configuration
debug 2020-10-14T06:55:54.875+0000 7f87075b1280 -1 Cannot find zone id= (name=cn-dc9-1)
debug 2020-10-14T06:55:54.875+0000 7f87075b1280  0 ERROR: failed to start notify service ((22) Invalid argument
debug 2020-10-14T06:55:54.875+0000 7f87075b1280  0 ERROR: failed to init services (ret=(22) Invalid argument)
debug 2020-10-14T06:55:54.879+0000 7f87075b1280 -1 Couldn't init storage provider (RADOS)
I am not really sure how to get past this point, so any pointers / help
would be much appreciated.
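From reading the log, my guess is that the zone named in the apply command
does not exist yet. A rough sketch of what creating it manually might look
like (the realm and zone names are taken from my apply command, the
zonegroup name "default" is an assumption, and I have not verified that
this is the right fix):

  radosgw-admin realm create --rgw-realm=ikea --default
  radosgw-admin zonegroup create --rgw-zonegroup=default --master --default
  radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=cn-dc9-1 --master --default
  radosgw-admin period update --rgw-realm=ikea --commit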
Thanks,
- Karsten
Hello,
I have a bucket which is close to 10 million objects (9.1 million). We have:
rgw_dynamic_resharding = false
rgw_override_bucket_index_max_shards = 100
rgw_max_objs_per_shard = 100000
Do I need to increase these numbers soon, or is that not possible, so that they need to start using a new bucket?
The version is Luminous 12.2.8.
I don't want an outage, so I want to be prepared.
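For context, with the settings above the usual guidance works out to about 100 shards x 100,000 objects = 10 million objects for this bucket, which is why I am asking now. A sketch of what I understand the manual path to be (the bucket name is an example, and I have not run the reshard yet):

  # current object count and shard layout for the bucket
  radosgw-admin bucket stats --bucket=mybucket
  # per-bucket fill status relative to the per-shard warning level, if available on this version
  radosgw-admin bucket limit check
  # offline reshard to 200 shards; on Luminous, writes to the bucket are blocked while this runs
  radosgw-admin bucket reshard --bucket=mybucket --num-shards=200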
Thank you
Hi Ceph users,
I'm working on a Common Lisp client utilizing the rados library. I've got
some results, but I don't know how to estimate whether I am getting
reasonable performance. I'm running a test cluster from a laptop - 2 OSDs,
each a VM with 4 GB RAM and 4 vCPUs; the monitors and mgr run on the same
VM(s). As for storage, I have a Samsung SSD 860 Pro, 512 GB. The disk is
split into 2 logical volumes (LVM), and those volumes are attached to the
VMs. I know I can't expect too much from that layout; I just want to know
if I'm getting adequate numbers. I'm doing read/write operations on very
small objects - up to 1 KB. For async writes I'm getting ~7.5-8.0 KIOPS.
Synchronous reads are pretty much the same, 7.5-8.0 KIOPS. Async reads are
segfaulting, I don't know why. The disk itself is capable of delivering
well above 50 KIOPS, so the difference is an order of magnitude. Any info
is more than welcome.
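For comparison, a rough baseline I could run against the same pool with the
stock tools (the pool name is just an example):

  # 10-second write test with 1 KiB objects and 16 concurrent ops; keep the objects for the read test
  rados bench -p testpool 10 write -b 1024 -t 16 --no-cleanup
  # random-read test against the objects written above
  rados bench -p testpool 10 rand -t 16
  # remove the benchmark objects afterwards
  rados -p testpool cleanup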
Daniel Mezentsev, founder
(+1) 604 313 8592.
Soleks Data Group.
Shaping the clouds.
There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on 14
Oct 2020 at 1630 UTC, and will run for thirty minutes. Everyone with a
documentation-related request or complaint is invited.
The meeting will be held here: https://bluejeans.com/908675367
Send documentation-related requests and complaints to me by replying to
this email and CCing me at zac.dover(a)gmail.com.
The next DocuBetter meeting is scheduled for:
14 Oct 2020 1630 UTC
Etherpad: https://pad.ceph.com/p/Ceph_Documentation
Zac's docs whiteboard: https://pad.ceph.com/p/docs_whiteboard
Report Documentation Bugs: https://pad.ceph.com/p/Report_Documentation_Bugs
Meeting: https://bluejeans.com/908675367
Hi all,
Does anyone have a production cluster on Ubuntu 20.04 (Focal), or any
suggestions, or know of any bugs that prevent deploying Ceph Octopus on Ubuntu 20.04?
Thanks.
I'm happy to announce another release of the go-ceph API
bindings. This is a regular release following our every-two-months release
cadence.
https://github.com/ceph/go-ceph/releases/tag/v0.6.0
Changes in the release are detailed in the link above.
The bindings aim to play a similar role to the "pybind" python bindings in the
ceph tree but for the Go language. These API bindings require the use of cgo.
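For anyone who wants to try it out, a minimal way to pull this release into a
Go module looks like this (building also needs the Ceph development headers,
e.g. librados-dev, since cgo is used):

  go get github.com/ceph/go-ceph@v0.6.0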
There are already a few consumers of this library in the wild, including the
ceph-csi project.
Specific questions, comments, bugs, etc. are best directed at our GitHub
issues tracker.
--
John Mulligan
phlogistonjohn(a)asynchrono.us
jmulligan(a)redhat.com
Hello,
We are running a 14.2.7 cluster - 3 nodes with 24 OSDs. I've recently
started getting 'BlueFS spillover detected', and I'm now up to 3 OSDs in
this state.
Scanning through the various online sources, I haven't been able to
determine how to respond to this condition.
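For reference, the checks I know of look like this (osd.12 is just an
example id, and the compact step is only something I have seen suggested
online, not verified):

  # shows which OSDs report spillover and how much metadata spilled to the slow device
  ceph health detail
  # on the OSD's host: the bluefs section shows DB usage on the fast vs. slow device
  ceph daemon osd.12 perf dump
  # on the OSD's host: trigger a RocksDB compaction, often suggested to temporarily clear spillover
  ceph daemon osd.12 compact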
Please advise.
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu
Hi Team,
I would like to validate cephadm on bare metal, using docker/podman as the
container runtime.
Currently we use a NUMA-aware configuration on bare metal to improve performance.
Is there any config I can apply in cephadm so that podman/docker is run
with the --cpuset-cpus=<num> and --cpuset-mems=<nodes> options in the
startup scripts?
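For clarity, the effect we are after is what these plain docker flags
provide (the cpu range, memory node and test command are just examples):

  # pin a container to the cores and memory of NUMA node 0, then print the resulting masks
  docker run --rm --cpuset-cpus=0-15 --cpuset-mems=0 --entrypoint /usr/bin/grep \
      docker.io/ceph/ceph:v15 -E 'Cpus_allowed_list|Mems_allowed_list' /proc/self/status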
Thanks,
Muthu