Picking up on what you wrote in your message of 29/08/2019...
> Another possibility is to convert the MBR to GPT (sgdisk --mbrtogpt) and
> give the partition its UID (also sgdisk). Then it could be linked by
> its uuid.
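[Editor's note: a hedged sketch of what the quoted sgdisk suggestion amounts to; /dev/sdX and the partition number are placeholders, and the conversion rewrites the partition table, so it is worth trying on a disposable node first.]
# convert the existing MBR partition table to GPT in place
sgdisk --mbrtogpt /dev/sdX
# give partition 2 (the journal) a fresh random GUID, then print it back
sgdisk --partition-guid=2:R /dev/sdX
sgdisk --info=2 /dev/sdX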
and, in another email:
> And I forgot that you can also re-create the journal by itself. I can't
> recall the command ATM though.
Ahem, I stated that the journal disks are also the OS disks, and I'm using old
servers, so I think that converting to GPT will lead to an unbootable
node...
But is the 'code' that identifies (and changes permissions on) the journal
devices PVE-specific, or generic Ceph? I suppose the latter...
Also, I've done:
adduser ceph disk
and the partition devices are '660 root:disk': why do I still get 'permission
denied'?
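[Editor's note: a hedged sketch of the generic Ceph side of this. The type GUID below is the conventional Ceph journal partition type matched by ceph-disk's udev rules, and it only applies once the partition table is GPT; the device name is a placeholder.]
# one-off workaround: hand the journal partition to the ceph user directly
chown ceph:ceph /dev/sdb2
# persistent variant, e.g. in /etc/udev/rules.d/95-ceph-journal.rules:
# ACTION=="add", SUBSYSTEM=="block", ENV{ID_PART_ENTRY_TYPE}=="45b0969e-9b03-4f30-b4c6-b4b80ceff106", OWNER="ceph", GROUP="ceph", MODE="660"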
> Or if you are not in need of filestore OSDs, re-create them as bluestore
> ones. AFAICS, Ceph has laid more focus on bluestore and it might be
> better to do a conversion sooner than later. (my opinion)
Not for now; the BlueStore migration needs a bit more
time/study/knowledge...
--
dott. Marco Gaiarin GNUPG Key ID: 240A3D66
Associazione ``La Nostra Famiglia'' http://www.lanostrafamiglia.it/
Polo FVG - Via della Bontà, 7 - 33078 - San Vito al Tagliamento (PN)
marco.gaiarin(at)lanostrafamiglia.it t +39-0434-842711 f +39-0434-842797
I have 2 OpenStack environments that I want to integrate with an existing Ceph
cluster. I know it can technically be done, but has anyone tried this?
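[Editor's note: a hedged outline of the usual pattern for pointing OpenStack at an external Ceph cluster; pool and client names are placeholders, not anything from this thread.]
ceph osd pool create volumes 128
ceph osd pool create images 128
ceph auth get-or-create client.cinder mon 'profile rbd' osd 'profile rbd pool=volumes'
ceph auth get-or-create client.glance mon 'profile rbd' osd 'profile rbd pool=images'
# then reference these users/pools from cinder.conf and glance-api.conf on the OpenStack side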
- Vlad
All;
We're trying to add a RADOSGW instance to our new production cluster, and it's not showing up in the dashboard or in ceph -s.
The cluster is running 14.2.2, and the RADOSGW host got 14.2.3.
systemctl status ceph-radosgw@rgw.s700037 returns: active (running).
ss -ntlp does NOT show port 80.
Here's the ceph.conf on the system:
[global]
fsid = effc5134-e0cc-4628-a079-d67b60071f90
mon initial members = s700034,s700035,s700036
mon host = [v1:10.0.80.10:6789/0,v2:10.0.80.10:3300/0],[v1:10.0.80.11:6789/0,v2:10.0.80.11:3300/0],[v1:10.0.80.12:6789/0,v2:10.0.80.12:3300/0]
public network = 10.0.80.0/24
cluster network = 10.0.88.0/24
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
osd pool default size = 3
osd pool default min size = 2
osd pool default pg num = 8
osd pool default pgp num = 8
[client.rgw.s700037]
host = s700037.performair.local
rgw frontends = "civetweb port=80"
rgw dns name = radosgw.performair.local
Any thoughts on what I'm missing?
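[Editor's note: a hedged set of checks, assuming the conventional keyring location for an instance named rgw.s700037; the gateway only appears in ceph -s once it can authenticate and register with the cluster.]
# the systemd instance name must match the ceph.conf section [client.rgw.s700037]
systemctl status ceph-radosgw@rgw.s700037
# civetweb falls back to its default port 7480 if the section is not picked up
ss -ntlp | grep -E ':(80|7480)'
ls -l /var/lib/ceph/radosgw/ceph-rgw.s700037/keyring
journalctl -u ceph-radosgw@rgw.s700037 --since today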
I'm also seeing these in the manager's logs:
2019-09-10 15:49:43.946 7efe6eee1700 0 mgr[dashboard] [10/Sep/2019:15:49:43] ENGINE Error in HTTPServer.tick
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cherrypy/wsgiserver/wsgiserver2.py", line 1837, in start
self.tick()
File "/usr/lib/python2.7/site-packages/cherrypy/wsgiserver/wsgiserver2.py", line 1902, in tick
s, ssl_env = self.ssl_adapter.wrap(s)
File "/usr/lib/python2.7/site-packages/cherrypy/wsgiserver/ssl_builtin.py", line 52, in wrap
keyfile=self.private_key, ssl_version=ssl.PROTOCOL_SSLv23)
File "/usr/lib64/python2.7/ssl.py", line 934, in wrap_socket
ciphers=ciphers)
File "/usr/lib64/python2.7/ssl.py", line 609, in __init__
self.do_handshake()
File "/usr/lib64/python2.7/ssl.py", line 831, in do_handshake
self._sslobj.do_handshake()
SSLError: [SSL: SSLV3_ALERT_CERTIFICATE_UNKNOWN] sslv3 alert certificate unknown (_ssl.c:618)
Thoughts on this?
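[Editor's note on the SSL traceback: CERTIFICATE_UNKNOWN usually just means a client rejected the dashboard's self-signed certificate. Two hedged options on Nautilus, assuming the mgr dashboard module is the listener in question.]
# regenerate the dashboard's self-signed certificate
ceph dashboard create-self-signed-cert
# or drop TLS and serve plain HTTP instead
ceph config set mgr mgr/dashboard/ssl false
ceph mgr module disable dashboard
ceph mgr module enable dashboard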
Thank you,
Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
DHilsbos(a)PerformAir.com
www.PerformAir.com
Hello,
Running 14.2.3, updated from 14.2.1.
Until recently I had ceph-mgr colocated with the OSDs. I've now installed
ceph-mgr on separate servers and everything looks OK in the Ceph status,
but there are multiple issues:
1. The dashboard only runs on the old mgr servers. I tried restarting the
daemons and disabling/enabling the dashboard plugin. The new mgrs won't listen
on the dashboard port.
2. To (re)enable the dashboard plugin I had to use "--force" (see the hedged
package note after the traceback below):
# ceph mgr module enable dashboard
Error ENOENT: all mgr daemons do not support module 'dashboard',
pass --force to force enablement
3. When accessing the Cluster -> Manager modules menu in the dashboard
I get a 500 error message. The exact error is below:
----
2019-09-10 15:01:39.270 7fb6d4916700 0 mgr[dashboard]
[10/Sep/2019:15:01:39] HTTP Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py", line
656, in respond
response.body = self.handler()
File "/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py",
line 188, in __call__
self.body = self.oldhandler(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/cherrypy/_cptools.py", line
221, in wrap
return self.newhandler(innerfunc, *args, **kwargs)
File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 88,
in dashboard_exception_handler
return handler(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py",
line 34, in __call__
return self.callable(*self.args, **self.kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line
649, in inner
ret = func(*args, **kwargs)
File "/usr/share/ceph/mgr/dashboard/controllers/__init__.py", line
842, in wrapper
return func(*vpath, **params)
File "/usr/share/ceph/mgr/dashboard/controllers/mgr_modules.py",
line 35, in list
obj['enabled'] = True
TypeError: 'NoneType' object does not support item assignment
2019-09-10 15:01:39.271 7fb6d4916700 0 mgr[dashboard]
[::ffff:192.168.15.55:54860] [GET] [500] [0.014s] [admin] [1.3K]
/api/mgr/module
2019-09-10 15:01:39.272 7fb6d4916700 0 mgr[dashboard] ['{"status":
"500 Internal Server Error", "version": "3.2.2", "detail": "The server
encountered an unexpected condition which prevented it from fulfilling
the request.", "traceback": "Traceback (most recent call last):\\n
File \\"/usr/lib/python2.7/site-packages/cherrypy/_cprequest.py\\",
line 656, in respond\\n response.body = self.handler()\\n File
\\"/usr/lib/python2.7/site-packages/cherrypy/lib/encoding.py\\", line
188, in __call__\\n self.body = self.oldhandler(*args, **kwargs)\\n
File \\"/usr/lib/python2.7/site-packages/cherrypy/_cptools.py\\",
line 221, in wrap\\n return self.newhandler(innerfunc, *args,
**kwargs)\\n File
\\"/usr/share/ceph/mgr/dashboard/services/exception.py\\", line 88, in
dashboard_exception_handler\\n return handler(*args, **kwargs)\\n
File \\"/usr/lib/python2.7/site-packages/cherrypy/_cpdispatch.py\\",
line 34, in __call__\\n return self.callable(*self.args,
**self.kwargs)\\n File
\\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 649,
in inner\\n ret = func(*args, **kwargs)\\n File
\\"/usr/share/ceph/mgr/dashboard/controllers/__init__.py\\", line 842,
in wrapper\\n return func(*vpath, **params)\\n File
\\"/usr/share/ceph/mgr/dashboard/controllers/mgr_modules.py\\", line
35, in list\\n obj[\'enabled\'] = True\\nTypeError: \'NoneType\'
object does not support item assignment\\n"}']
----
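[Editor's note on issue 2: the ENOENT usually means the new mgr hosts lack the dashboard plugin itself, which ships as a separate package since Nautilus; a hedged sketch, assuming an RPM-based install as the paths in the traceback suggest.]
# on each new mgr host
yum install ceph-mgr-dashboard
systemctl restart ceph-mgr.target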
Anyone got the same problems after adding new manager nodes? Is there
something I'm missing here?
Thanks!
---
Alex Cucu
Hi,
I am using Ceph version 13.2.6 (Mimic) on a test setup, trying out CephFS.
My current setup:
3 nodes; one node contains two bricks and the other 2 nodes contain a single
brick each.
The volume is 3-replica; I am trying to simulate a node failure.
I powered down one host and started getting the message
"-bash: fork: Cannot allocate memory" on the other systems when running any
command, and the systems stop responding to commands.
What could be the reason for this?
At this stage, I am able to read some of the data stored in the volume,
while other reads just wait for I/O.
output from "sudo ceph -s"
cluster:
id: 7c138e13-7b98-4309-b591-d4091a1742b4
health: HEALTH_WARN
1 osds down
2 hosts (3 osds) down
Degraded data redundancy: 5313488/7970232 objects degraded
(66.667%), 64 pgs degraded
services:
mon: 1 daemons, quorum mon01
mgr: mon01(active)
mds: cephfs-tst-1/1/1 up {0=mon01=up:active}
osd: 4 osds: 1 up, 2 in
data:
pools: 2 pools, 64 pgs
objects: 2.66 M objects, 206 GiB
usage: 421 GiB used, 3.2 TiB / 3.6 TiB avail
pgs: 5313488/7970232 objects degraded (66.667%)
64 active+undersized+degraded
io:
client: 79 MiB/s rd, 24 op/s rd, 0 op/s wr
output from: sudo ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
0 hdd 1.81940 0 0 B 0 B 0 B 0 0 0
3 hdd 1.81940 0 0 B 0 B 0 B 0 0 0
1 hdd 1.81940 1.00000 1.8 TiB 211 GiB 1.6 TiB 11.34 1.00 0
2 hdd 1.81940 1.00000 1.8 TiB 210 GiB 1.6 TiB 11.28 1.00 64
TOTAL 3.6 TiB 421 GiB 3.2 TiB 11.31
MIN/MAX VAR: 1.00/1.00 STDDEV: 0.03
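[Editor's note: "fork: Cannot allocate memory" on the surviving hosts usually points at the remaining OSDs consuming the nodes' RAM while PGs are degraded; a hedged knob for a small test box, where the 1.5 GiB value is an arbitrary assumption to tune per host.]
# Mimic BlueStore OSDs target roughly 4 GiB of memory each by default; lower it cluster-wide
ceph config set osd osd_memory_target 1610612736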
regards
Amudhan
Hi Team,
I have a production OpenStack that was deployed using kolla-ansible, and
Docker containers were used to deploy the Ceph storage cluster.
Now, 1 of 4 OSDs is down. How can I bring it up?
Basically, I need to manage the Docker-containerized services and restart
them, i.e. bring the OSD service back up on one of the Ceph nodes.
Can you please help if any of you are familiar with this?
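[Editor's note: a hedged sketch of the usual kolla-ansible pattern; the container name is an assumption, so check it with docker ps first.]
# on the Ceph node hosting the down OSD
docker ps -a | grep ceph_osd
docker start ceph_osd_<id>            # <id> is the OSD number from 'ceph osd tree'
docker logs --tail 100 ceph_osd_<id>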
Thanks,
Reddi Prasad YENDLURI
Cloud Specialist
M +65 8345 9599 | D +65 6220 9908
Office: 51B Circular Road Singapore 049406
Hi,
we had a failing hard disk; I replaced it and now want to create a new
OSD on it.
But ceph-volume fails under these circumstances. In the original setup,
the OSDs were created with ceph-volume lvm batch, using a bunch of drives
and an NVMe device for the bluestore db. The batch mode uses a volume group
on the NVMe device instead of partitions. I have removed the former db
logical volume, the LVM setup for the former hard disk, and all other
leftovers. Creating a new OSD with any combination of devices now fails:
--data /dev/sda --block.db <nvme device>
--data /dev/sda --block.db <nvme volume group>
--data /dev/sda --block.db <lv created manually in nvme volume group>
# ceph-volume lvm create --bluestore --data /dev/sda --block.db
/dev/ceph-block-dbs-ea684aa8-544e-4c4a-8664-6cb50b3116b8/osd-block-db-a8f1489a-d97b-479e-b9a7-30fc9fa99cb5
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
-i - osd new 55cfb9f8-aa30-4f8b-8b95-a43d3f97fe5b
Running command: /sbin/vgcreate -s 1G --force --yes
ceph-1a2ddc14-780b-45e4-a036-e928862e6ccb /dev/sda
stdout: Physical volume "/dev/sda" successfully created.
stdout: Volume group "ceph-1a2ddc14-780b-45e4-a036-e928862e6ccb"
successfully created
Running command: /sbin/lvcreate --yes -l 100%FREE -n
osd-block-55cfb9f8-aa30-4f8b-8b95-a43d3f97fe5b
ceph-1a2ddc14-780b-45e4-a036-e928862e6ccb
stdout: Logical volume
"osd-block-55cfb9f8-aa30-4f8b-8b95-a43d3f97fe5b" created.
--> blkid could not detect a PARTUUID for device:
/dev/ceph-block-dbs-ea684aa8-544e-4c4a-8664-6cb50b3116b8/osd-block-db-a8f1489a-d97b-479e-b9a7-30fc9fa99cb5
--> Was unable to complete a new OSD, will rollback changes
Running command: /usr/bin/ceph --cluster ceph --name
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
osd purge-new osd.136 --yes-i-really-mean-it
stderr: purged osd.136
--> RuntimeError: unable to use device
In all cases ceph-volume is not able to detect a partition UUID for the
db device (which is correct, since the device is a logical volume...).
Running 'ceph-volume lvm batch' again results in an OSD that does not use the
NVMe device as db.
So what is the recommended way to manually create an OSD from a given
hard disk and an existing logical volume as the db device? I would like
to avoid having to zap all the other OSDs using the NVMe device and recreate
them in a single run with 'ceph-volume lvm batch ...'.
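[Editor's note, a hedged sketch: some ceph-volume releases only treat --block.db as a logical volume when it is given in vg/lv form; passed as a /dev path it may be handled as a raw device, which is where the PARTUUID lookup comes from. Using the names from the output above, under that assumption:]
ceph-volume lvm create --bluestore --data /dev/sda \
    --block.db ceph-block-dbs-ea684aa8-544e-4c4a-8664-6cb50b3116b8/osd-block-db-a8f1489a-d97b-479e-b9a7-30fc9fa99cb5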
Regards,
Burkhard