Hi all,
We have a 12-node OSD cluster in which I recently found out that 'osd_crush_chooseleaf_type = 0' made its way into our ceph.conf file, probably from previous testing. I believe this is the reason a recent maintenance on an OSD node caused data to stop flowing. In researching how to fix this, I wanted to confirm a few things and see if anybody who has done this before has any perspective or things to look out for.
1) I believe the correct way to fix this is by following the five-step method in the documentation: get, decompile, edit, recompile, set (see the sketch after these questions). Is that correct, and is the change simply replacing 'choose_firstn' with 'chooseleaf_firstn'? Do I make this change on only one mon, and will it propagate to all other mons and OSDs?
2) Does the rebalancing start immediately after the setcrushmap command?
3) Are there any files to back up prior to this operation? This is production data, so we cannot afford any data loss.
4) Any other notes/things to be aware of?
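For reference, this is the sketch of what I am planning to run, from my reading of the docs. The rule line in step 2 is my assumption of what the decompiled map will contain given 'osd_crush_chooseleaf_type = 0'; corrections welcome:

  # 1. Get and decompile the current CRUSH map
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # 2. Edit crushmap.txt. I expect the default rule to contain
  #        step choose firstn 0 type osd
  #    and plan to change it to
  #        step chooseleaf firstn 0 type host
  #    so that replicas land on separate hosts.

  # 3. Recompile and sanity-check the new map
  crushtool -c crushmap.txt -o crushmap-new.bin
  crushtool -i crushmap-new.bin --test --show-statistics --rule 0 --num-rep 3

  # 4. Inject the new map (cluster-wide, so running it on one mon is enough)
  ceph osd setcrushmap -i crushmap-new.bin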
Thank you
Hi
Recently our Ceph cluster (Nautilus) has been experiencing BlueFS spillovers, on just 2 OSDs so far, and I disabled the warning for those OSDs:
(ceph config set osd.125 bluestore_warn_on_bluefs_spillover false)
I'm wondering what causes this and how it can be prevented.
As I understand it, the RocksDB for the OSD needs to store more than fits on the NVMe logical volume (123 G for a 12 T OSD). A way to fix it could be to enlarge the logical volume on the NVMe, if there were space left on the NVMe, which there isn't at the moment.
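For completeness, this is how I checked the spillover, via the admin socket on the OSD host (if I read the counters right, slow_used_bytes is the amount of RocksDB data that ended up on the slow device):

  # run on the host carrying osd.125
  ceph daemon osd.125 perf dump bluefs
  # in the output, compare bluefs db_used_bytes (DB on the NVMe LV)
  # with slow_used_bytes (DB data spilled onto the HDD)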
This is the current size of the cluster and how much is free:
[root@cephmon1 ~]# ceph df
RAW STORAGE:
    CLASS    SIZE       AVAIL      USED       RAW USED    %RAW USED
    hdd      1.8 PiB    842 TiB    974 TiB    974 TiB     53.63
    TOTAL    1.8 PiB    842 TiB    974 TiB    974 TiB     53.63

POOLS:
    POOL                   ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
    cephfs_data            1     572 MiB    121.26M    2.4 GiB    0        167 TiB
    cephfs_metadata        2     56 GiB     5.15M      57 GiB     0        167 TiB
    cephfs_data_3copy      8     201 GiB    51.68k     602 GiB    0.09     222 TiB
    cephfs_data_ec83       13    643 TiB    279.75M    953 TiB    58.86    485 TiB
    rbd                    14    21 GiB     5.66k      64 GiB     0        222 TiB
    .rgw.root              15    1.2 KiB    4          1 MiB      0        167 TiB
    default.rgw.control    16    0 B        8          0 B        0        167 TiB
    default.rgw.meta       17    765 B      4          1 MiB      0        167 TiB
    default.rgw.log        18    0 B        207        0 B        0        167 TiB
    cephfs_data_ec57       20    433 MiB    230        1.2 GiB    0        278 TiB
The amount used can still grow a bit before we need to add nodes, but apparently we are already running into the limits of our RocksDB partitions. Did we choose a parameter (e.g. minimal object size) too small, so that these spillover OSDs hold too many objects? Or is it that too many small files are stored on the CephFS filesystems?
When we expand the cluster, we can choose larger NVMe devices to allow larger RocksDB partitions, but is that the right way to deal with this, or should we adjust some parameters on the cluster that will reduce the RocksDB size?
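If we do choose larger NVMe devices later, my assumption is that we could also grow an existing DB volume in place instead of recreating the OSD, roughly like this (untested sketch; the VG/LV names are made up):

  systemctl stop ceph-osd@125
  lvextend -L +100G /dev/vg-nvme/db-125   # grow the DB logical volume
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-125
  systemctl start ceph-osd@125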
Cheers
/Simon
Good afternoon!
I have a small Ceph cluster running with Proxmox, and I recently updated and rebooted one of the nodes. So far so good.
But after a couple of hours, I saw this:
root@pve2:~# ceph health detail
HEALTH_ERR 16/1101836 objects unfound (0.001%); Possible data damage: 2 pgs recovery_unfound; Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
OBJECT_UNFOUND 16/1101836 objects unfound (0.001%)
    pg 1.37 has 6 unfound objects
    pg 1.48 has 10 unfound objects
PG_DAMAGED Possible data damage: 2 pgs recovery_unfound
    pg 1.37 is active+recovery_unfound+undersized+degraded+remapped, acting [11,17], 6 unfound
    pg 1.48 is active+recovery_unfound+undersized+degraded+remapped, acting [5,11], 10 unfound
PG_DEGRADED Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized
    pg 1.37 is stuck undersized for 446774.454853, current state active+recovery_unfound+undersized+degraded+remapped, last acting [11,17]
    pg 1.48 is stuck undersized for 446774.459466, current state active+recovery_unfound+undersized+degraded+remapped, last acting [5,11]
root@pve2:~# ceph -s
  cluster:
    id:     76e70c34-bce9-4f86-b049-0054f21c3494
    health: HEALTH_ERR
            16/1101836 objects unfound (0.001%)
            Possible data damage: 2 pgs recovery_unfound
            Degraded data redundancy: 48/3305508 objects degraded (0.001%), 2 pgs degraded, 2 pgs undersized

  services:
    mon: 3 daemons, quorum pve3,pve1,pve2 (age 2w)
    mgr: pve3(active, since 2w), standbys: pve1, pve2
    mds: cephfs:1 {0=pve1=up:active} 2 up:standby
    osd: 25 osds: 25 up (since 5d), 25 in (since 8d); 2 remapped pgs

  data:
    pools:   4 pools, 672 pgs
    objects: 1.10M objects, 2.9 TiB
    usage:   8.6 TiB used, 12 TiB / 21 TiB avail
    pgs:     48/3305508 objects degraded (0.001%)
             16/1101836 objects unfound (0.001%)
             669 active+clean
             2   active+recovery_unfound+undersized+degraded+remapped
             1   active+clean+scrubbing+deep

  io:
    client: 680 B/s rd, 2.6 MiB/s wr, 0 op/s rd, 151 op/s wr
I am not really concerned about the lost data, since I am 99% sure it belonged to a faulty Prometheus server anyway.
The question is, how can I remove the warnings without affecting the
other objects?
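The only candidate I have found in the docs so far is marking the unfound objects lost per PG, e.g. for the two PGs above:

  # list the unfound objects first
  ceph pg 1.37 list_unfound
  ceph pg 1.48 list_unfound
  # then either revert each object to an older version, or forget it entirely
  ceph pg 1.37 mark_unfound_lost revert
  ceph pg 1.48 mark_unfound_lost delete

But I understand 'delete' is irreversible, so I would appreciate a second opinion before running it.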
Thankful for any pointers!
--
Jonathan Sélea
Website: https://jonathanselea.se
PGP Key: 0x8B35B3C894B964DD
Fingerprint: 4AF2 10DE 996B 673C 0FD8 AFA0 8B35 B3C8 94B9 64DD
Hello,
After an unexpected power outage, our production cluster has 5 PGs inactive and incomplete. The OSDs on which these 5 PGs are located all show "stuck requests are blocked":
Reduced data availability: 5 pgs inactive, 5 pgs incomplete
98 stuck requests are blocked > 4096 sec. Implicated osds 63,80,492,494
What is the best procedure to get these PGs back? These PGs all belong to pools with a replica count of 2.
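So far I have only inspected the PGs along these lines (pg 2.1f here stands in for one of the five affected PGs), without attempting anything invasive:

  ceph pg dump_stuck inactive
  ceph pg 2.1f query   # shows the peering state and which OSDs the PG is probing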
Best,
Martin
Hi all,
We have a bonus Ceph Tech Talk for August. Join us August 20th at 17:00
UTC to hear Neeha Kompala and Jason Weng present on Edge Application -
Streaming Multiple Video Sources.
Don't forget on August 27th at 17:00 UTC, Pritha Srivastava will also be
presenting on this month's Ceph Tech Talk: Secure Token Service in the
Rados Gateway.
If you're interested in giving a Ceph Tech Talk for September 24th or
October 22nd, please let me know!
https://ceph.io/ceph-tech-talks/
--
Mike Perez
He/Him
Ceph Community Manager
Red Hat Los Angeles <https://www.redhat.com>
thingee@redhat.com
M: 1-951-572-2633   IM: IRC Freenode/OFTC: thingee
494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee <https://twitter.com/thingee>
Hi... Will it be available on YouTube?
On Thursday, August 20, 2020, Marc Roos <M.Roos@f1-outsourcing.eu> wrote:
>
> Can't join as guest without enabling mic and/or camera???
>
> -----Original Message-----
> From: Mike Perez [mailto:miperez@redhat.com]
> Sent: donderdag 20 augustus 2020 19:03
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] Re: Bonus Ceph Tech Talk: Edge Application -
> Stream Multiple Video Sources
>
> And we're live! Please join us and bring questions!
>
> https://bluejeans.com/908675367
>
Are there any plans to add access logs to the beast frontend, the same way we can get them with civetweb? Increasing the "debug rgw" setting really doesn't provide the same thing.
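For comparison, with civetweb a plain access log only took a frontend option in ceph.conf, e.g. (the client section name here is just an example):

  [client.rgw.gateway1]
  rgw frontends = civetweb port=8080 access_log_file=/var/log/ceph/civetweb.access.log

Something equivalent for beast would be very useful.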
Graham
--
Graham Allan - gta@umn.edu
Associate Director of Operations - Minnesota Supercomputing Institute
I still need to move from ceph-disk to ceph-volume. When doing this, I want to also start using disk encryption; I am not really interested in the encryption offered by the HDD vendors.
Is there a best practice or advice on which ciphers/hashes to use for the encryption? Stick to the CentOS 7 defaults, or choose something else? Different settings for SSD / HDD?
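My plan so far is simply to let ceph-volume handle the dm-crypt layer when recreating the OSDs, something like this (untested; /dev/sdb is just an example device):

  # recreate an OSD with the data device wrapped in LUKS/dm-crypt
  ceph-volume lvm create --bluestore --data /dev/sdb --dmcrypt

As far as I understand, this formats the device with LUKS using the distribution's cryptsetup defaults, hence the question about ciphers and hashes.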