[@~]# ceph-volume lvm zap /dev/sdi
--> Zapping: /dev/sdi
--> --destroy was not specified, but zapping a whole device will remove
the partition table
stderr: wipefs: error: /dev/sdi: probing initialization failed: Device
or resource busy
--> failed to wipefs device, will try again to workaround probable race
condition
stderr: wipefs: error: /dev/sdi: probing initialization failed: Device
or resource busy
--> failed to wipefs device, will try again to workaround probable race
condition
I can't see where it is busy - at least it does not show up in lsof.
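A few hedged things worth checking when wipefs reports the device busy (these will not show up in lsof if device-mapper still holds the disk):

ls /sys/block/sdi/holders/   # any dm-* entry means an LV/crypt mapping still sits on the disk
lsblk /dev/sdi               # shows whatever is still stacked on the device
dmsetup ls                   # list device-mapper targets; a stale one can be removed with dmsetup remove <name>

If a leftover ceph LV is the culprit, zapping with --destroy (which also tears down the VG/LV) may get past the busy error:

ceph-volume lvm zap --destroy /dev/sdi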
Hello,
recently we wanted to re-adjust rebalancing speed in one cluster with
ceph tell osd.* injectargs '--osd-max-backfills 4'
ceph tell osd.* injectargs '--osd-recovery-max-active 4'
The first osds responded, but after about 6-7 osds ceph tell stopped
progressing, just after it encountered a dead osd (osd.10). We have
since removed osd.10 and all osds in the cluster are up.
However, as soon as we issue either of the above tell commands, it just
hangs. Furthermore, while ceph tell hangs, pgs also become stuck in
"Activating" and "Peering" states.
The two seem to be related: as soon as we stop ceph tell (ctrl-c it),
the pgs become peered/active a few minutes later.
We can also reproduce this problem with very busy osds that have been
moved to another host - they do not react to the ceph tell commands either.
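A hedged workaround sketch (not a fix for the hang itself): the same values can be applied without ceph tell, either via the monitors' central config database or via each OSD's local admin socket on its host, so the client never has to open a connection to every OSD:

ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 4

# or per daemon, run on the host where the OSD lives (osd.0 is just an example id):
ceph daemon osd.0 config set osd_max_backfills 4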
We are mostly on 14.2.9, except for the rgw:
[16:44:47] black2.place6:~# ceph versions
{
"mon": {
"ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 3
},
"mgr": {
"ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 3
},
"osd": {
"ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 85
},
"mds": {},
"rgw": {
"ceph version 20200428-923-g4004f081ec (4004f081ec047d60e84d76c2dad6f31e2ac44484) nautilus (stable)": 1
},
"overall": {
"ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)": 91,
"ceph version 20200428-923-g4004f081ec (4004f081ec047d60e84d76c2dad6f31e2ac44484) nautilus (stable)": 1
}
}
Has anyone seen this before, and/or do you have a hint on how to debug
ceph tell, given that it is not a daemon of its own?
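One hedged way to see where it gets stuck: raise the client-side messenger/mon-client debug levels on the ceph tell invocation itself; the extra output on stderr should show which OSD the client is trying to reach when it stops progressing (debug levels and the osd id are just examples):

ceph --debug-ms=1 --debug-monc=10 tell osd.0 injectargs '--osd-max-backfills 4'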
Best regards,
Nico
--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
At least ceph taught you the essence of doing proper testing first ;)
Because if you test your use case, you either get a positive or a negative
result - not a problem.
However, I do have to admit that ceph could be more transparent about
publishing testing and performance results. I have already discussed
this with them at one such ceph day. It does not make sense to have to
work everything out yourself, e.g. the luks overhead, putting the db/wal
on ssd, rbd performance on hdds, etc. Those results can quickly show
whether ceph can be a candidate or not.
-----Original Message-----
From: Kevin Myers [mailto:response@ifastnet.com]
Cc: Janne Johansson; Marc Roos; ceph-devel; ceph-users
Subject: Re: [ceph-users] Re: Understanding what ceph-volume does, with
bootstrap-osd/ceph.keyring, tmpfs
Tbh, ceph caused us more problems than it tried to fix - ymmv, good luck.
> On 22 Sep 2020, at 13:04, tri(a)postix.net wrote:
>
> The key is stored in the ceph cluster config db. It can be retrieved
> by
>
> KEY=`/usr/bin/ceph --cluster ceph --name
> client.osd-lockbox.${OSD_FSID} --keyring $OSD_PATH/lockbox.keyring
> config-key get dm-crypt/osd/$OSD_FSID/luks`
>
> September 22, 2020 2:25 AM, "Janne Johansson" <icepic.dz(a)gmail.com> wrote:
>
>> On Mon, 21 Sep 2020 at 16:15, Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote:
>>
>>> When I create a new encrypted osd with ceph volume[1]
>>>
>>> Q4: Where is this luks passphrase stored?
>>
>> I think the OSD asks the mon for it after auth:ing, so "in the mon DBs" somewhere.
>>
>> --
>> May the most significant bit of your life be positive.
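A hedged companion example to the quoted answer, showing how to list the stored dm-crypt keys in the mon config-key store (the exact key path follows the pattern in the command above; <osd-fsid> is a placeholder):

ceph config-key ls | grep dm-crypt
ceph config-key get dm-crypt/osd/<osd-fsid>/luks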
Hi there,
We have a 9-node Ceph cluster. The Ceph version is 15.2.5. The cluster has 175 OSDs (HDD) + 3 NVMe for the cache tier of the "cephfs_data" pool. CephFS pools info:
POOL             ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
cephfs_data       1  350 TiB  179.53M  350 TiB  66.93     87 TiB
cephfs_metadata   3  3.1 TiB   17.69M  3.1 TiB   1.77     87 TiB
We use multiple active MDS instances: 3 "active" and 3 "standby". Each MDS server has 128GB RAM, "mds cache memory limit" = 64GB.
Failover to a standby MDS instance takes 10-15 hours! CephFS is unreachable for the clients the entire time; the MDS instance just stays in the "up:replay" state.
It looks like the MDS daemon is checking all of the folders:
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EOpen.replay
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay 3 dirlumps by unknown.0
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay dir 0x300000041c5
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay updated dir [dir 0x300000041c5 /repository/files/14/ [2,head] auth v=2070324 cv=0/0 state=1610612737|complete f(v0 m2020-09-10T13:05:29.297254-0700 515=0+515) n(v46584 rc2020-09-21T20:38:49.071043-0700 b3937793650802 1056114=601470+454644) hs=515+0,ss=0+0 dirty=75 | child=1 subtree=0 dirty=1 0x55d4c9359b80]
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay for [2,head] had [dentry #0x1/repository/files/14/14119 [2,head] auth (dversion lock) v=2049516 ino=0x30000812e2f state=1073741824 | inodepin=1 0x55db2463a1c0]
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay for [2,head] had [inode 0x30000812e2f [...2,head] /repository/files/14/14119/ auth fragtree_t(*^3) v2049516 f(v0 m2020-09-18T10:17:53.379121-0700 13498=0+13498) n(v6535 rc2020-09-19T05:52:25.035403-0700 b272027384385 112669=81992+30677) (iversion lock) | dirfrag=8 0x55db24643000]
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay dir 0x30000812e2f.000*
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay updated dir [dir 0x30000812e2f.000* /repository/files/14/14119/ [2,head] auth v=77082 cv=0/0 state=1073741824 f(v0 m2020-09-18T10:17:53.371122-0700 1636=0+1636) n(v6535 rc2020-09-19T05:51:18.063949-0700 b33321023818 13707=9986+3721) hs=885+0,ss=0+0 | child=1 0x55db845bf080]
2020-09-22T02:43:44.406-0700 7f22ae99e700 10 mds.0.journal EMetaBlob.replay added (full) [dentry #0x1/repository/files/14/14119/39823 [2,head] auth NULL (dversion lock) v=0 ino=(nil) state=1073741888|bottomlru 0x55d82061a900]
We tried standby-replay; it helps but doesn't eliminate the root cause.
We have millions of folders with millions of small files. When the folder/subfolder scan is done, CephFS is active again. I believe 10 hours of downtime is unexpected behaviour. Is there any way to force the MDS to change its status to active and run all of the required directory checks in the background? How can I localise the root cause?
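A few hedged things to look at while the rank sits in up:replay, to confirm that replay is progressing and to see how large the journal actually is (the file system name and MDS name below are placeholders):

ceph fs status                                        # which MDS holds the replaying rank and its state
ceph daemon mds.<name> perf dump mds_log              # journal position counters hint at replay progress
cephfs-journal-tool --rank=<fsname>:0 journal inspect # read-only sanity check of the journal for rank 0

A very long journal translates directly into a long replay, so capping its size (mds_log_max_segments) may shorten future failovers, at the cost of less metadata write batching; that would be a mitigation rather than a fix for the replay speed itself.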
I have an optimization script that I run after the reboot of a ceph node.
It sets, among other things, /sys/block/sdg/queue/read_ahead_kb and
/sys/block/sdg/queue/nr_requests of the block devices being used for osds.
Normally I use the mount command to discover these, but with the
tmpfs and ceph-volume setup this does not work.
Is anyone able to share a simple one-liner for getting the block devices
used by the osds? E.g. for both types at the same time. Or do I really
need to traverse block -> /dev/mapper/... ?
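A hedged sketch of such a one-liner, assuming ceph-volume lvm OSDs and jq being installed; it resolves the block symlink in each tmpfs-mounted OSD directory, or asks ceph-volume directly for the underlying devices:

for b in /var/lib/ceph/osd/ceph-*/block; do echo "$b -> $(readlink -f "$b")"; done
ceph-volume lvm list --format json | jq -r '.[][].devices[]'

The first variant returns the dm/LV node (one more hop, e.g. via lsblk -s, gets you to the parent disk), while the ceph-volume variant prints the physical /dev/sdX devices directly.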
Hello again,
following up on the previous mail, one cluster is getting rather slow at
the moment and we have spotted something "funny":
When checking ceph pg dump we see some osds have HB peers with osds that
they should not have any pg in common with.
When restarting one of the affected osds, we get the following message:
mon_cmd_maybe_osd_create fail: 'osd.12 has already bound to class
'xxx-ssd', can not reset class to 'hdd'; use 'ceph osd crush
rm-device-class <id>' to remove old class first': (16) Device or
resource busy
When checking the output of ceph osd tree, it seems to be in the correct
class:
12 xxx-ssd 0.21767 osd.12 up 1.00000 1.00000
Is it possible that the osd has "multiple" classes / that the cluster
remembers a class that was set on osd.12 when it used to be an HDD?
The output of ceph pg dump includes this at the bottom:
OSD_STAT USED AVAIL USED_RAW TOTAL HB_PEERS PG_SUM PRIMARY_PG_SUM
12 150 GiB 72 GiB 151 GiB 223 GiB [3,11,13,25,36,43,54,64,71,82] 128 35
which is wrong, because osd.12 should only peer with osd.3 and osd.25,
which are the only ones in the same pool whose replicated rule is set to
match on xxx-ssd.
And the obvious question: how do we fix this?
At the moment we see around 75 pgs in peering and 39 in activating,
most of which are in a pool with slower SSDs, but it seems that these
peerings affect another pool that should have faster SSDs.
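A hedged sketch of the fix hinted at by the error message itself (the osd id and class name are taken from the output above; verify with ceph osd tree first):

ceph osd crush rm-device-class osd.12
ceph osd crush set-device-class xxx-ssd osd.12

rm-device-class drops whatever class the crush map still has bound to the osd, and set-device-class reapplies the intended one, which should let osd.12 restart without tripping the 'has already bound to class' check. Whether this also explains the unexpected HB peers is less clear, since heartbeat peers are, as far as I know, not strictly limited to OSDs that share PGs.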
Best regards,
Nico
--
Modern, affordable, Swiss Virtual Machines. Visit www.datacenterlight.ch
I'm new on the list,
so a "Hello" to all! :-)
We're planning a Proxmox cluster. The data-center operator advised us to
use a virtual machine with NFS on top of a single CephFS instance to
mount the shared CephFS storage on multiple hosts/VMs.
As this NFS/CephFS VM could be a bottleneck, I was wondering whether
CephFS is capable of managing concurrent access and locking itself.
Is it possible to mount CephFS on multiple hosts (e.g. at /srv), all
accessing the same data objects, without data loss or deadlocks from
concurrent access?
Will this perform better than a single NFS/CephFS instance (VM)?
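For what it's worth, CephFS itself is built for this: the MDS coordinates locking and client capabilities, so mounting the same file system on many hosts concurrently is the normal mode of operation. A hedged example of a direct kernel-client mount on each host (monitor address, user name and secret file are placeholders):

mount -t ceph 192.168.1.10:6789:/ /srv -o name=myuser,secretfile=/etc/ceph/myuser.secret

Whether this beats a single NFS gateway VM depends on the workload, but it does remove that VM as a single bottleneck and point of failure.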
Thanx for any hint
Renne