Hello Marco,
On Thu, Aug 29, 2019 at 12:55:56PM +0200, Marco Gaiarin wrote:
>
> I've just finished a double upgrade on my ceph (PVE-based) from hammer
> to jewel and from jewel to luminous.
>
> All went well, apart from one thing: the OSDs do not restart
> automatically, because of permission troubles on the journal:
>
> Aug 28 14:41:55 capitanmarvel ceph-osd[6645]: starting osd.2 at - osd_data /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
> Aug 28 14:41:55 capitanmarvel ceph-osd[6645]: 2019-08-28 14:41:55.449886 7fa505a43e00 -1 filestore(/var/lib/ceph/osd/ceph-2) mount(1822): failed to open journal /var/lib/ceph/osd/ceph-2/journal: (13) Permission denied
> Aug 28 14:41:55 capitanmarvel ceph-osd[6645]: 2019-08-28 14:41:55.453524 7fa505a43e00 -1 osd.2 0 OSD:init: unable to mount object store
> Aug 28 14:41:55 capitanmarvel ceph-osd[6645]: 2019-08-28 14:41:55.453535 7fa505a43e00 -1 #033[0;31m ** ERROR: osd init failed: (13) Permission denied#033[0m
>
>
> A quick rewind: when I set up the cluster I used some 'old' servers,
> with a couple of SSD disks serving as OS and journal devices.
> Because the servers were old, I was forced to partition the boot disk
> in DOS (MBR), not GPT, mode.
>
> While creating the OSDs, I received some warnings:
>
> WARNING:ceph-disk:Journal /dev/sdaX was not prepared with ceph-disk. Symlinking directly.
>
>
> Looking at the cluster now, it seems to me that the OSD init scripts
> try to identify the journal based on GPT partition labels/info, and so
> they clearly fail.
>
>
> Note that if I run, on the servers that hold OSDs:
>
> for l in $(readlink -f /var/lib/ceph/osd/ceph-*/journal); do chown ceph: $l; done
>
> the OSDs start flawlessly.
>
>
> Is there something I can do? Thanks.
Did you go through our upgrade guide(s)? See the link [0] below for the
permission changes; they are needed when upgrading from Hammer to
Jewel.
On the wiki you can also find the upgrade guides for PVE 5.x -> 6.x and
Luminous -> Nautilus.
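For reference, the permission change that guide describes is essentially
handing /var/lib/ceph to the ceph user introduced in Jewel. A sketch (the
journal loop is the one from the message quoted above; the rest is an
assumption based on the guide, run as root with the OSDs stopped):

```shell
# Stop the OSDs, then give the Jewel-era ceph user ownership of the
# data directories.
systemctl stop ceph-osd.target
chown -R ceph:ceph /var/lib/ceph

# Journals that ceph-disk did not prepare are only symlinked, so their
# targets need the same ownership change. Without GPT type codes, udev
# will not re-own the devices at boot, so this may need repeating (or a
# custom udev rule).
for l in $(readlink -f /var/lib/ceph/osd/ceph-*/journal); do
    chown ceph:ceph "$l"
done

systemctl start ceph-osd.target
```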
--
Cheers,
Alwin
[0] https://pve.proxmox.com/wiki/Ceph_Hammer_to_Jewel#Set_permission
Hi,
I'm running a small nautilus cluster (14.2.2) which was recently
upgraded from mimic (13.2.6). After the upgrade I enabled the
pg_autoscaler which resulted in most of the pools having their pg count
changed. All the remapping has completed but the cluster is still
reporting a HEALTH_WARN. I have adjusted the target ratios such that
sum < 1.0 but this didn't help. What else can I look at?
Thanks,
James
# ceph -s
  cluster:
    id:     ...
    health: HEALTH_WARN
            1 subtrees have overcommitted pool target_size_bytes
            1 subtrees have overcommitted pool target_size_ratio

  services:
    mon: 3 daemons, quorum ceph-00,ceph-01,ceph-02 (age 3d)
    mgr: ceph-01(active, since 6d), standbys: ceph-02, ceph-00
    osd: 32 osds: 32 up (since 2d), 32 in (since 2d)
    rgw: 1 daemon active (rgw-00)

  data:
    pools:   14 pools, 1512 pgs
    objects: 4.17M objects, 16 TiB
    usage:   47 TiB used, 69 TiB / 116 TiB avail
    pgs:     1510 active+clean
             2    active+clean+scrubbing+deep
# ceph osd pool autoscale-status (this might wrap horribly...):
POOL                    SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
loc.rgw.buckets.index   0                    3.0   116.1T        0.0000                1.0   4                   on
vms1                    5318G                3.0   116.1T        0.1341  0.2000        1.0   256                 on
vms2                    3419G                3.0   116.1T        0.0862  0.0200        1.0   64                  on
.rgw.root               3648k                3.0   116.1T        0.0000                1.0   4                   on
default.rgw.meta        384.0k               3.0   116.1T        0.0000                1.0   4                   on
lov.rgw.log             384.0k               3.0   116.1T        0.0000                1.0   4                   on
vms3                    35799G               3.0   116.1T        0.9028  0.6000        1.0   1024                on
default.rgw.control     0                    3.0   116.1T        0.0000                1.0   4                   on
loc.rgw.meta            768.5k               3.0   116.1T        0.0000                1.0   4                   on
vms4                    2306G                3.0   116.1T        0.0582  0.1000        1.0   128                 on
loc.rgw.buckets.non-ec  200.4k               3.0   116.1T        0.0000                1.0   4                   on
loc.rgw.buckets.data    56390M               3.0   116.1T        0.0014                1.0   4                   on
loc.rgw.control         0                    3.0   116.1T        0.0000                1.0   4                   on
default.rgw.log         0                    3.0   116.1T        0.0000                1.0   4                   on
Hi,
I have a cluster running on Ubuntu Bionic, with stock Ubuntu Ceph packages. When upgrading, I always try to follow the procedure as documented here: https://docs.ceph.com/docs/master/install/upgrading-ceph/
However, the Ubuntu packages restart all daemons upon upgrade, per node. So if I upgrade the first node, it will restart mon, osds, rgw, and mds'es on that node, even though the rest of the cluster is running the old version.
I tried upgrading a single package, to see how that goes, but due to dependencies in dpkg, all other packages are upgraded as well.
How should I proceed?
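A commonly used Debian mechanism for this (a sketch, not part of this
message) is a temporary policy-rc.d that forbids service actions while
the packages upgrade, followed by restarting one daemon type at a time:

```shell
# invoke-rc.d treats exit status 101 as "action forbidden", so
# maintainer scripts cannot restart daemons during the upgrade.
cat > /usr/sbin/policy-rc.d <<'EOF'
#!/bin/sh
exit 101
EOF
chmod +x /usr/sbin/policy-rc.d

apt-get install --only-upgrade ceph ceph-base ceph-mon ceph-osd

# Remove the override, then restart in the recommended order: mons
# first, then OSDs, then the remaining daemons.
rm /usr/sbin/policy-rc.d
systemctl restart ceph-mon.target
```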
Thanks,
--
Mark Schouten <mark(a)tuxis.nl>
Tuxis, Ede, https://www.tuxis.nl
T: +31 318 200208
Hello,
I've been facing some issues with a single-node Ceph cluster (Mimic). I
know an environment like this shouldn't be in production, but the server
ended up dealing with operational workloads for the last 2 years.
Some users detected issues in CephFS: some files were not accessible,
and listing the content of the affected folders hung the node.
I also noticed a heavy memory load on the server; main memory was
largely consumed by cache, and a considerable amount of swap was in use.
The command "ceph health detail" reported some inactive PGs. Those PGs
didn't exist.
After rebooting the node, an fsck was run on the 3 affected OSDs:
ceph-bluestore-tool fsck --deep yes --path /var/lib/ceph/osd/ceph-1/
Unfortunately, all of them crashed with a core dump and now they don't
start anymore.
The logs report messages like:
2019-08-28 03:00:12.999 7f21d787c240  4 rocksdb: [/build/ceph-13.2.1/src/rocksdb/db/version_set.cc:3088] Recovering from manifest file: MANIFEST-004059
2019-08-28 03:00:12.999 7f21d787c240  4 rocksdb: [/build/ceph-13.2.1/src/rocksdb/db/db_impl.cc:252] Shutdown: canceling all background work
2019-08-28 03:00:12.999 7f21d787c240  4 rocksdb: [/build/ceph-13.2.1/src/rocksdb/db/db_impl.cc:397] Shutdown complete
2019-08-28 03:00:12.999 7f21d787c240 -1 rocksdb: NotFound:
2019-08-28 03:00:12.999 7f21d787c240 -1 bluestore(/var/lib/ceph/osd/ceph-0) _open_db erroring opening db:
2019-08-28 03:00:12.999 7f21d787c240  1 bluefs umount
2019-08-28 03:00:12.999 7f21d787c240  1 stupidalloc 0x0x5650c5255800 shutdown
2019-08-28 03:00:12.999 7f21d787c240  1 bdev(0x5650c5604a80 /var/lib/ceph/osd/ceph-0/block) close
2019-08-28 03:00:13.247 7f21d787c240  1 bdev(0x5650c5604700 /var/lib/ceph/osd/ceph-0/block) close
2019-08-28 03:00:13.479 7f21d787c240 -1 osd.0 0 OSD:init: unable to mount object store
2019-08-28 03:00:13.479 7f21d787c240 -1 ** ERROR: osd init failed: (5) Input/output error
I'm not sure if the fsck has introduced additional damage.
After that, I tried to mark unfound as lost with the following commands:
ceph pg 4.1e mark_unfound_lost revert
ceph pg 9.1d mark_unfound_lost revert
ceph pg 13.3 mark_unfound_lost revert
ceph pg 13.e mark_unfound_lost revert
Currently, since there are 3 OSDs down, there are:
316 unclean PGs
76 inactive PGs
root@ceph-s01:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME             STATUS REWEIGHT PRI-AFF
-2        0.43599 root ssd
-4        0.43599     disktype ssd_disk
12   ssd  0.43599         osd.12            up  1.00000 1.00000
-1       60.03792 root default
-5       60.03792     disktype hdd_disk
 0   hdd        0         osd.0           down  1.00000 1.00000
 1   hdd  5.45799         osd.1           down        0 1.00000
 2   hdd  5.45799         osd.2             up  1.00000 1.00000
 3   hdd  5.45799         osd.3             up  1.00000 1.00000
 4   hdd  5.45799         osd.4             up  1.00000 1.00000
 5   hdd  5.45799         osd.5             up  1.00000 1.00000
 6   hdd  5.45799         osd.6             up  1.00000 1.00000
 7   hdd  5.45799         osd.7           down        0 1.00000
 8   hdd  5.45799         osd.8             up  1.00000 1.00000
 9   hdd  5.45799         osd.9             up  1.00000 1.00000
10   hdd  5.45799         osd.10            up  1.00000 1.00000
11   hdd  5.45799         osd.11            up  1.00000 1.00000
Running the following command, a MANIFEST file appeared in the folder
db/lost; I guess the repair moved it there.
# ceph-bluestore-tool bluefs-export --path /var/lib/ceph/osd/ceph-7 --out-dir osd7/
...
db/LOCK
db/MANIFEST-000001
db/OPTIONS-018543
db/OPTIONS-018581
db/lost/
db/lost/MANIFEST-018578
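Not part of what was tried above, just a hedged thought: with a
block-level backup of the OSD device safely taken first, the built-in
repair subcommand would be the next standard step, though given that
fsck itself crashed, the outcome here is uncertain:

```shell
# Back up the raw device before this; repair may well crash the same
# way fsck did on this damage (sketch only).
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-7
```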
Any ideas? Suggestions?
Thank you.
Regards,
Jordi
I have an OSD that is throwing sense errors; it's at its end of life and needs to be replaced.
The server is in the datacentre and I won't get there for a few weeks, so I've stopped the service (systemctl stop ceph-osd@208) and let the cluster rebalance; all is well.
My thinking is that if for some reason the host that OSD208 resides within was to reboot, that OSD would start and become part of the cluster again.
So I'd like to prevent this OSD from ever starting again, given that I can't physically remove it from the server yet.
I was thinking that deleting its key from the auth list might work, so: ceph osd purge 208
Then when the service tries to start, it'll fail with an auth error.
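The steps above might be sketched as follows (illustrative; purge
permanently removes the OSD's auth key and CRUSH entry, so it is only
appropriate if the OSD will never return):

```shell
# Keep the dying OSD from rejoining after a host reboot (sketch).
systemctl disable ceph-osd@208               # don't start on boot
ceph osd purge 208 --yes-i-really-mean-it    # drop auth key + CRUSH entry
```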
Any other suggestions?
Cheers,
Cory
Sorry to post this to the list, but does this lists.ceph.io password
reset work for anyone?
https://lists.ceph.io/accounts/password/reset/
For my accounts which are getting mail I have "The e-mail address is
not assigned to any user account".
Best Regards, Dan
Hi Dominic,
I just created a feature ticket in the Ceph tracker to keep track of
this issue.
Here's the ticket: https://tracker.ceph.com/issues/41537
Cheers,
Ricardo Dias
On 17/07/19 20:06, DHilsbos(a)performair.com wrote:
> All;
>
> I'm trying to firm up my understanding of how Ceph works, and ease of management tools and capabilities.
>
> I stumbled upon this: http://docs.ceph.com/docs/nautilus/rados/configuration/mon-lookup-dns/
>
> It got me wondering; how do you convey protocol version 2 capabilities in this format?
>
> The examples all list port 6789, which is the port for protocol version 1. Would I add SRV records for port 3300? How does the client distinguish v1 from v2 in this case?
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director - Information Technology
> Perform Air International, Inc.
> DHilsbos(a)PerformAir.com
> www.PerformAir.com
>
>
> _______________________________________________
> ceph-users mailing list
> ceph-users(a)lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
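For illustration only (hostnames are hypothetical): the v1-style records
from the linked mon-lookup-dns docs look roughly like the zone fragment
below; advertising the v2 port (3300) through the same mechanism is what
the ticket tracks.

```
; _ceph-mon._tcp SRV records, v1 messenger port (illustrative hosts)
_ceph-mon._tcp.example.com. 60 IN SRV 10 20 6789 mon1.example.com.
_ceph-mon._tcp.example.com. 60 IN SRV 10 30 6789 mon2.example.com.
_ceph-mon._tcp.example.com. 60 IN SRV 10 50 6789 mon3.example.com.
```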
--
Ricardo Dias
Senior Software Engineer - Storage Team
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284
(AG Nürnberg)
Hi,
We have all SSD disks as ceph's backend storage.
Considering the cost factor, can we set up the cluster to have only two
replicas for objects?
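For reference, the replica count is a per-pool setting. A sketch
("mypool" is a placeholder; note that size 2 is widely considered risky,
because a single failure during recovery can lose data):

```shell
# Two replicas per object on an existing pool.
ceph osd pool set mypool size 2
# min_size 1 keeps I/O going with one copy left, at real risk of loss.
ceph osd pool set mypool min_size 1
```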
thanks & regards
Wesley
It seems that with Linux kernel 4.16.10, krbd clients are seen as Jewel
rather than Luminous. Can someone tell me which kernel version will be
seen as Luminous, as I want to enable the upmap balancer?
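For context (not from this message): kernel clients advertise feature
bits rather than a release name, and the kernel client has understood
upmap since roughly 4.13, so the usual approach is to check what the
connected clients report and then force the requirement. A sketch:

```shell
# Inspect the feature bits the connected clients actually advertise.
ceph features
# Only if every client is known to understand upmap:
ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it
```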