Hi all,
We run a simple single-zone Nautilus radosGW setup with a few gateway machines for some of our users. I've got some more gateway machines earmarked for adding OpenStack Keystone-integrated radosGW gateways to the cluster. I'm not sure how best to add them alongside the existing radosGW gateways/infrastructure. The options I think I have are:
1) Add Keystone integration to all radosGW gateways. This is the simplest, but I have (possibly unfounded) concerns about Keystone causing problems for non-OpenStack users (added authentication latency), and I'm not sure I fully understand how the OpenStack users/buckets will interact with our existing users. The Keystone settings involved are sketched after this list.
2) Add keystone integration to separate gateways. This keeps the radosGW servers separate, and deals with one of my concerns above.
3) Add a separate radosGW zone/instance (not sure what the correct term is), and have separate gateways for this instance. Seems very heavyweight for what I'm trying to achieve, but that may be my inexperience talking.
4) Something else entirely?
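For reference, the Keystone-related RGW settings I would expect to add on whichever gateways get the integration look roughly like this (a minimal sketch with placeholder values, not our actual config; the section name, URL and credentials are made up):
[client.rgw.gw-keystone-1]
rgw_keystone_url = https://keystone.example.com:5000
rgw_keystone_api_version = 3
rgw_keystone_admin_user = rgw-service
rgw_keystone_admin_password = <secret>
rgw_keystone_admin_domain = Default
rgw_keystone_admin_project = service
rgw_keystone_accepted_roles = member,admin
rgw_s3_auth_use_keystone = true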
Any advice would be greatly appreciated!
Cheers,
Tom
Dear all,
maybe someone can give me a pointer here. We are running OpenNebula with Ceph RBD as the back-end store. We have a pool of spinning disks for creating large, low-demand data disks, mainly for backups and other cold storage. Everything is fine with Linux VMs. However, Windows VMs perform poorly: they are roughly a factor of 20 slower than a similarly created Linux VM.
If anyone has pointers what to look for, we would be very grateful.
The OpenNebula installation is more or less default. The current OS and libvirt versions we use are:
Centos 7.6 with stock kernel 3.10.0-1062.1.1.el7.x86_64
libvirt-client.x86_64 4.5.0-23.el7_7.1 @updates
qemu-kvm-ev.x86_64 10:2.12.0-33.1.el7 @centos-qemu-ev
Some benchmark results from good to worse workloads:
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 16 : 450MB/s
rbd bench --io-size 4M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 230MB/s
rbd bench --io-size 1M --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 190MB/s
rbd bench --io-size 64K --io-total 4G --io-pattern seq --io-type write --io-threads 1 : 150MB/s
rbd bench --io-size 64K --io-total 1G --io-pattern rand --io-type write --io-threads 1 : 26MB/s
dd with conv=fdatasync gives an impressive 500MB/s inside a Linux VM for a sequential write of 4GB.
We copied a couple of large ISO files inside the Windows VM, and for the first ca. 1 to 1.5GB it performs as expected. Thereafter, however, write speed drops rapidly to ca. 25MB/s and does not recover. It is almost as if Windows translates large sequential writes into small random writes.
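To check that hypothesis, our plan is to compare against a small-block random-write benchmark on the same pool and see whether it matches the ~25MB/s we observe in Windows (a rough sketch reusing the tool above; the 4K size and the pool/image name are placeholders):
rbd bench --io-size 4K --io-total 1G --io-pattern rand --io-type write --io-threads 1 sata_pool/test_img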
If anyone has seen and solved this before, please let us know.
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hello. I'm trying to upgrade from Ceph 15.2.3 to 15.2.4. The upgrade is
almost finished, but it has entered a service start/stop loop. I'm using
a container deployment on Debian 10 with 4 nodes. The problem is with a
service literally named "mds.label:mds". The name contains a colon, which
has special meaning in Docker: it cannot appear in a container name and it
also breaks the volume binding syntax.
I can see the files for this service under /var/lib/ceph/UUID/:
root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca# ls -la
total 48
drwx------ 12 167 167 4096 jul 10 02:54 .
drwxr-x--- 3 ceph ceph 4096 jun 24 16:36 ..
drwx------ 3 nobody nogroup 4096 jun 24 16:37 alertmanager.ceph-admin
drwx------ 3 167 167 4096 jun 24 16:36 crash
drwx------ 2 167 167 4096 jul 10 01:35 crash.ceph-admin
drwx------ 4 998 996 4096 jun 24 16:38 grafana.ceph-admin
drwx------ 2 167 167 4096 jul 10 02:55 mds.label:mds.ceph-admin.rwmtkr
drwx------ 2 167 167 4096 jul 10 01:33 mgr.ceph-admin.doljkl
drwx------ 3 167 167 4096 jul 10 01:34 mon.ceph-admin
drwx------ 2 nobody nogroup 4096 jun 24 16:38 node-exporter.ceph-admin
drwx------ 4 nobody nogroup 4096 jun 24 16:38 prometheus.ceph-admin
drwx------ 4 root root 4096 jul 3 02:43 removed
root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr# ls -la
total 32
drwx------ 2 167 167 4096 jul 10 02:55 .
drwx------ 12 167 167 4096 jul 10 02:54 ..
-rw------- 1 167 167 295 jul 10 02:55 config
-rw------- 1 167 167 152 jul 10 02:55 keyring
-rw------- 1 167 167 38 jul 10 02:55 unit.configured
-rw------- 1 167 167 48 jul 10 02:54 unit.created
-rw------- 1 root root 24 jul 10 02:55 unit.image
-rw------- 1 root root 0 jul 10 02:55 unit.poststop
-rw------- 1 root root 981 jul 10 02:55 unit.run
root@ceph-admin:/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr# cat unit.run
/usr/bin/install -d -m0770 -o 167 -g 167 /var/run/ceph/0ce93550-b628-11ea-9484-f6dc192416ca
/usr/bin/docker run --rm --net=host --ipc=host --name ceph-0ce93550-b628-11ea-9484-f6dc192416ca-mds.label:mds.ceph-admin.rwmtkr -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph-admin -v /var/run/ceph/0ce93550-b628-11ea-9484-f6dc192416ca:/var/run/ceph:z -v /var/log/ceph/0ce93550-b628-11ea-9484-f6dc192416ca:/var/log/ceph:z -v /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/crash:/var/lib/ceph/crash:z -v /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr:/var/lib/ceph/mds/ceph-label:mds.ceph-admin.rwmtkr:z -v /var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr/config:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph-mds docker.io/ceph/ceph:v15 -n mds.label:mds.ceph-admin.rwmtkr -f --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix="debug "
If I try to manually run the docker command, this is the error:
docker: Error response from daemon: Invalid container name
(ceph-0ce93550-b628-11ea-9484-f6dc192416ca-mds.label:mds.ceph-admin.rwmtkr),
only [a-zA-Z0-9][a-zA-Z0-9_.-] are allowed.
If I try with a different container name, then the volume binding error
arises:
docker: Error response from daemon: invalid volume specification:
'/var/lib/ceph/0ce93550-b628-11ea-9484-f6dc192416ca/mds.label:mds.ceph-admin.rwmtkr:/var/lib/ceph/mds/ceph-label:mds.ceph-admin.rwmtkr:z'.
This mds is not needed and I would be happy to simply remove it, but I
don't know how. The documentation explains how to do this for "normal"
services, but my installation is a container deployment. I have tried
removing the directory and restarting the upgrade process, but the
directory for this service then reappears.
Please, how can I remove or rename this service so I can complete the
upgrade?
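I suspect the orchestrator commands below are the intended way, but I have not dared to run them yet (this is only my guess; the names are quoted because of the colon and taken from the directory listing above):
ceph orch ls                                                    # confirm the exact service name
ceph orch rm "mds.label:mds"                                    # remove the whole service
ceph orch daemon rm "mds.label:mds.ceph-admin.rwmtkr" --force   # or remove just this daemon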
Also, I think it's a bug that Docker-forbidden characters are allowed in
service names for container deployments; this should be validated.
Thank you very much.
--
*Mario J. Barchéin Molina*
*Departamento de I+D+i*
mario(a)intelligenia.com
Madrid: +34 911 86 35 46
US: +1 (918) 856 - 3838
Granada: +34 958 07 70 70
Hi Ceph Users,
I'm struggling with an issue and hoping someone can point me towards a solution.
We are using Nautilus (14.2.9), deploying Ceph in containers inside VMs. The setup I'm working with has 3 VMs, but of course our design expects this to be scaled by a user as appropriate. I have a cluster deployed and it's functioning happily as storage for our product; the error occurs when I go to set up a second cluster and pair it with the first. I'm using ceph-ansible to deploy. I get the following error about 20 minutes into running the site-container playbook.
2020-07-09 14:21:10,966 p=2134 u=qs-admin | TASK [ceph-rgw : fetch the realm] ***********************************************************************************************
************************************************************************************
2020-07-09 14:21:10,966 p=2134 u=qs-admin | Thursday 09 July 2020 14:21:10 +0000 (0:00:00.410) 0:16:18.245 *********
2020-07-09 14:21:11,901 p=2134 u=qs-admin | fatal: [10.225.21.213 -> 10.225.21.213]: FAILED! => changed=true
cmd:
- docker
- exec
- ceph-mon-albamons_sc2
- radosgw-admin
- realm
- pull
- --url=https://10.225.36.197:7480
- --access-key=2CQ006Lereqpysbr0l0s
- --secret=JM3S5Hd49Nz03eIbTTNnEyqcXJkIOXbp0gWIUEbp
delta: '0:00:00.545895'
end: '2020-07-09 14:21:11.516539'
msg: non-zero return code
rc: 13
start: '2020-07-09 14:21:10.970644'
stderr: |-
request failed: (13) Permission denied
If the realm has been changed on the master zone, the master zone's gateway may need to be restarted to recognize this user.
stderr_lines: <omitted>
stdout: ''
stdout_lines: <omitted>
Re-running the command manually reproduces the error. I understand that the permission denied error appears to indicate the keys are not valid, as suggested by https://tracker.ceph.com/issues/36619. However, I've triple-checked that the keys are correct on the other site. I'm at a loss as to where to look for debugging; I've turned up logging on both the local and remote sites for the RGW and MON processes, but neither seems to yield anything related. I've tried restarting everything, as suggested in the error text, from the individual processes up to a full reboot of all the VMs. I've no idea why the keys are being declined either, as they are correct (or at least `radosgw-admin period get` on the primary site thinks so).
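For completeness, this is roughly what I've been running on the primary site to verify the credentials (the user ID and zone name are placeholders for whatever ceph-ansible actually created):
radosgw-admin user list
radosgw-admin user info --uid=<sync-user>         # check access_key/secret_key and that it is a system user
radosgw-admin zone get --rgw-zone=<master-zone>   # check the zone's system_key matches those keys
radosgw-admin period get                          # confirm the realm/period the secondary should pull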
Thanks for your help,
Alex
Hello! On Ceph Nautilus v14.2.10, I cannot use "compaction_threads" and "flusher_threads".
Why is setting these parameters via bluestore_rocksdb_options restricted?
Thank you all for helping me understand the 'size' setting clearly.
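For anyone who finds this thread later, the settings being discussed can be inspected and changed like this (the pool name is a placeholder):
ceph osd pool get <pool> size
ceph osd pool get <pool> min_size
ceph osd pool set <pool> size 3
ceph osd pool set <pool> min_size 2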
Ml Ml <mliebherr99(a)googlemail.com> wrote on Fri, 10 Jul 2020 at 23:08:
> If size is 2 and one disk fails, you are already going to be in an error
> state with read-only access.
>
> Let's say you reboot one node: you will instantly get into trouble.
>
> If you reboot one node and at the same time the other disk
> fails, then you very likely lose data.
>
> Just never ever use size 2. Not even temporarily :)
>
>
> Zhenshi Zhou <deaderzzs(a)gmail.com> schrieb am Fr., 10. Juli 2020, 04:11:
>
>> Hi,
>>
>> As we all know, the default replica setting of 'size' is 3, which means
>> there are 3 copies of an object. What are the disadvantages if I set it
>> to 2, other than getting fewer copies?
>>
>> Thanks
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>>
>
Hello,
I need to install Ceph on CentOS 7 with Ansible. I searched Ansible
Galaxy and some websites for a good howto and playbook.
Does anyone have a good howto for this?
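From what I have pieced together so far, the usual workflow looks roughly like the following, but I have not verified it (the stable-4.0 branch is my assumption for a Nautilus install on CentOS 7):
git clone https://github.com/ceph/ceph-ansible.git
cd ceph-ansible
git checkout stable-4.0
pip install -r requirements.txt
cp site.yml.sample site.yml
cp group_vars/all.yml.sample group_vars/all.yml
cp group_vars/osds.yml.sample group_vars/osds.yml
# edit group_vars/*.yml and an inventory file, then:
ansible-playbook -i inventory site.yml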
Thanks for help.
Regards
Hauke
--
www.compi-creative.net
Hi Cephers,
Can someone please share which research and industry conferences
accept new Ceph-related research results? Additionally, are there any
conferences that are particularly interested in Ceph results? I would
like to know about all suitable venues. Thanks :-)
Looking forward to hearing from you.
BR
Bobby !!
Hi,
our cluster is on Octopus 15.2.4. We noticed that all our MONs ran out of
space yesterday because the store.db folder kept growing until it filled
up the filesystem. We added more space to the MON nodes, but store.db
keeps growing.
Right now it's ~220GiB on the two MON nodes that are active. We shut
down one MON node when it hit ~98GiB; it seems it trimmed its
local store.db down to 102MiB, but now it also keeps growing again.
Checking the keys in store.db while the MON is offline shows a lot of
"logm" and "osdmap" keys:
ceph-monstore-tool <path> dump-keys|awk '{print $1}'|uniq -c
86 auth
2 config
11 health
275929 logm
55 mds_health
1 mds_metadata
602 mdsmap
599 mgr
1 mgr_command_descs
3 mgr_metadata
209 mgrstat
461 mon_config_key
1 mon_sync
7 monitor
1 monitor_store
7 monmap
454 osd_metadata
1 osd_pg_creating
4804 osd_snap
138366 osdmap
538 paxos
5 pgmap
I already tried compacting it with "ceph tell ..." and
"ceph-monstore-tool <path> compact" but it stayed the same size. Also
copying it with "ceph-monstore-tool <path> store-copy <new-path>" just
created a copy of the same size.
Our cluster is currently in WARN status because we are low on space and
several OSDs are in a backfill_full state. Could this be related?
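In case it's useful, the next thing we plan to check is how many osdmap epochs the MONs are still holding on to (a rough check; I'm assuming the fields below are present in the Octopus `ceph report` output and that jq is installed):
ceph report | jq '.osdmap_first_committed, .osdmap_last_committed'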
Regards,
Michael