Copying the ML, because I forgot to reply-all.
Reed
> On Apr 15, 2020, at 3:58 PM, Reed Dier <reed.dier(a)focusvq.com> wrote:
>
> The problem almost certainly stems from unbalanced OSD distribution among your hosts, assuming you are using the default 3x replication CRUSH rule with host as the failure domain.
>
> You are limited by your smallest bin size.
>
> In this case you have a 750GB HDD as the only OSD on node1, so when Ceph wants 3 copies across 3 hosts, only ~750GB of space can fulfill that requirement.
>
> Having lots of differently sized OSDs, and a different number of OSDs per host, is going to lead to under- and over-utilization.
>
>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>> -1 21.54213 root default
>> -3 0.75679 host node1
>> -5 5.39328 host node2
>> -10 15.39206 host node3
>
> You either need to redistribute your OSDs across your hosts, or possibly rethink your disk strategy.
> You could move osd.5 to node1, and osd.0 to node2, which would give you roughly 6TiB of usable hdd space across your three nodes.
>
> Reed
>
>> On Apr 15, 2020, at 10:50 AM, Simon Sutter <ssutter(a)hosttech.ch> wrote:
>>
>> Hello everybody,
>>
>>
>>
>> I'm very new to Ceph and installed a test environment (Nautilus).
>>
>> The current goal of this cluster is to serve as a short-term backup.
>>
>> For this we want to use older, mixed hardware, so for testing I set up very unbalanced nodes (you learn the most from exceptional circumstances, right?).
>>
>> For my CephFS I created two pools, one for metadata and one for data.
>>
>>
>>
>> I have three nodes and the ceph osd tree looks like this:
>> ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
>> -1 21.54213 root default
>> -3 0.75679 host node1
>> 0 hdd 0.75679 osd.0 up 0.00636 1.00000
>> -5 5.39328 host node2
>> 1 hdd 2.66429 osd.1 up 0.65007 1.00000
>> 3 hdd 2.72899 osd.3 up 0.65007 1.00000
>> -10 15.39206 host node3
>> 5 hdd 7.27739 osd.5 up 1.00000 1.00000
>> 6 hdd 7.27739 osd.6 up 1.00000 1.00000
>> 2 ssd 0.38249 osd.2 up 1.00000 1.00000
>> 4 ssd 0.45479 osd.4 up 1.00000 1.00000
>>
>>
>> The PGs, and thus the data, are extremely unbalanced, as you can see in the ceph osd df output:
>> ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
>> 0 hdd 0.75679 0.00636 775 GiB 651 GiB 650 GiB 88 KiB 1.5 GiB 124 GiB 84.02 7.26 112 up
>> 1 hdd 2.66429 0.65007 2.7 TiB 497 GiB 496 GiB 88 KiB 1.2 GiB 2.2 TiB 18.22 1.57 81 up
>> 3 hdd 2.72899 0.65007 2.7 TiB 505 GiB 504 GiB 8 KiB 1.3 GiB 2.2 TiB 18.07 1.56 88 up
>> 5 hdd 7.27739 1.00000 7.3 TiB 390 GiB 389 GiB 8 KiB 1.2 GiB 6.9 TiB 5.24 0.45 67 up
>> 6 hdd 7.27739 1.00000 7.3 TiB 467 GiB 465 GiB 64 KiB 1.3 GiB 6.8 TiB 6.26 0.54 78 up
>> 2 ssd 0.38249 1.00000 392 GiB 14 GiB 13 GiB 11 KiB 1024 MiB 377 GiB 3.68 0.32 2 up
>> 4 ssd 0.45479 1.00000 466 GiB 28 GiB 27 GiB 4 KiB 1024 MiB 438 GiB 6.03 0.52 4 up
>> TOTAL 22 TiB 2.5 TiB 2.5 TiB 273 KiB 8.4 GiB 19 TiB 11.57
>> MIN/MAX VAR: 0.32/7.26 STDDEV: 6.87
>>
>> To counteract this, I tried to turn on the balancer module.
>>
>> The module keeps decreasing the reweight of osd.0, while ceph pg stat tells me there are more and more misplaced objects:
>>
>> 144 pgs: 144 active+clean+remapped; 853 GiB data, 2.5 TiB used, 19 TiB / 22 TiB avail; 30 MiB/s wr, 7 op/s; 242259/655140 objects misplaced (36.978%)
>>
>>
>>
>> So my question is: is Ceph supposed to do that?
>> Why are all those objects misplaced? Because of those 112 PGs on osd.0?
>> Why are there 112 PGs on osd.0? I did not set any PG settings except the number: 512.
>>
>>
>>
>> Thank you very much
>> Simon Sutter
>> _______________________________________________
>> ceph-users mailing list -- ceph-users(a)ceph.io
>> To unsubscribe send an email to ceph-users-leave(a)ceph.io
>
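To illustrate Reed's point about the smallest failure domain: with 3x replication and one copy per host, usable HDD capacity is roughly capped by the smallest host's HDD weight (~0.76 TiB on node1 here). After the suggested swap, the per-host HDD weights would be about 7.28 (node1), 6.15 (node2: 2.66 + 2.73 + 0.76) and 7.28 (node3) TiB, hence the "roughly 6 TiB usable" estimate. A minimal sketch for confirming which rule is in play, assuming the data pool is called cephfs_data and the rule has the default name replicated_rule (substitute your own names):

    # which CRUSH rule does the data pool use?
    ceph osd pool get cephfs_data crush_rule
    # dump the rule; a "chooseleaf ... type host" step means one copy per host
    ceph osd crush rule dump replicated_rule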
Hi
After upgrading to 14.2.8 I can see that PUT operations are
significantly slower. GET and DELETE still have the same performance.
I double-checked the OSD nodes and I cannot find anything suspicious
there. No extreme iowait, etc.
Anyone have the same problem?
Kind regards / Pozdrawiam,
Katarzyna Myrek
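One hedged way to narrow down where the extra PUT latency is spent is to compare radosgw's own per-operation latency counters before and after the upgrade: if put_initial_lat grew while get_initial_lat did not, the regression is likely on the write path inside RGW rather than in the OSDs. A sketch, assuming a single radosgw on the host and the usual admin-socket location (adjust the .asok name to match your daemon):

    # average PUT vs GET first-byte latencies as seen by radosgw
    ceph daemon /var/run/ceph/ceph-client.rgw.*.asok perf dump \
      | jq '.rgw.put_initial_lat, .rgw.get_initial_lat'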
Hello, I have a question. I’m trying to deploy a RADOS gateway and the container keeps crashing in less than a second on the host. Ceph/cephadm keeps trying to recreate it and it's an endless loop. Looking at the log, the gateway container is failing to bind to port 80. How do I configure the settings for the gateway to change the port?
I see on the documentation page that the settings are configured via the monitor configuration, but which setting exactly? I tried adding rgw_frontends to ceph.conf and setting it with config set. I initially created the realm, zonegroup, and zone just like the cephadm page stated, but every time I run ceph orch apply rgw ... and then check with ceph config dump, I see the rgw service defaulting back to port 80.
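With cephadm the per-daemon rgw_frontends value is typically generated from the RGW service spec, which is why a value set by hand keeps being reset to port 80. One hedged approach is to put the port into the spec itself and re-apply it; depending on your Octopus release the spec may accept an rgw_frontend_port field, and the realm/zone/host names below are placeholders:

    # rgw.yaml -- sketch of an RGW service spec with a non-default port
    service_type: rgw
    service_id: myrealm.myzone
    placement:
      hosts:
        - myhost
    spec:
      rgw_frontend_port: 8080

    ceph orch apply -i rgw.yaml

If your release predates that field, an alternative is ceph config set on the daemon's config section (e.g. client.rgw.myrealm.myzone) with rgw_frontends "beast port=8080", though cephadm may override it again on redeploy.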
Hi,
I have read in many places that when an object is deleted from Ceph (RGW) it is
queued for deletion with the garbage collector (GC). However, when I delete
objects of various sizes (both smaller than 4MB and larger multipart uploads
greater than 4MB), I always find the gc list empty.
I also tried disabling GC to make sure it does not run and remove the entries,
yet I did not find any objects when I ran the command. Here's the output:
radosgw-admin gc list
[
{
"tag": "2~KsaJkJwSGeuVzeKpkHAe_5vJ3JqZmKc",
"time": "2020-04-10 18:25:23.0.769037s",
"objs": []
}
]
NOTE: I tried deleting objects on 14th April and on 15th April.
I issued the "s3cmd del" command.
The Ceph version I am using is Luminous.
Please let me know how object deletes work.
--
Regards,
Priya
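Two details may explain the empty list. First, objects small enough to fit in a single RADOS object (roughly 4MB or less) have no tail, so their head is deleted synchronously and nothing is queued for GC; only the tail objects of larger or multipart uploads go through GC. Second, radosgw-admin gc list by default only shows entries whose grace period (rgw_gc_obj_min_wait, 2 hours by default) has already expired. A minimal sketch:

    # show GC entries, including those still inside the grace period
    radosgw-admin gc list --include-all
    # trigger a garbage-collection pass by hand
    radosgw-admin gc process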
Hello everybody,
I'm very new to Ceph and installed a test environment (Nautilus).
The current goal of this cluster is to serve as a short-term backup.
For this we want to use older, mixed hardware, so for testing I set up very unbalanced nodes (you learn the most from exceptional circumstances, right?).
For my CephFS I created two pools, one for metadata and one for data.
I have three nodes and the ceph osd tree looks like this:
ID  CLASS  WEIGHT    TYPE NAME       STATUS  REWEIGHT  PRI-AFF
-1         21.54213  root default
-3          0.75679      host node1
 0    hdd   0.75679          osd.0       up   0.00636  1.00000
-5          5.39328      host node2
 1    hdd   2.66429          osd.1       up   0.65007  1.00000
 3    hdd   2.72899          osd.3       up   0.65007  1.00000
-10        15.39206      host node3
 5    hdd   7.27739          osd.5       up   1.00000  1.00000
 6    hdd   7.27739          osd.6       up   1.00000  1.00000
 2    ssd   0.38249          osd.2       up   1.00000  1.00000
 4    ssd   0.45479          osd.4       up   1.00000  1.00000
The PGs, and thus the data, are extremely unbalanced, as you can see in the ceph osd df output:
ID  CLASS  WEIGHT   REWEIGHT  SIZE     RAW USE  DATA     OMAP     META      AVAIL    %USE   VAR   PGS  STATUS
 0    hdd  0.75679   0.00636  775 GiB  651 GiB  650 GiB   88 KiB   1.5 GiB  124 GiB  84.02  7.26  112      up
 1    hdd  2.66429   0.65007  2.7 TiB  497 GiB  496 GiB   88 KiB   1.2 GiB  2.2 TiB  18.22  1.57   81      up
 3    hdd  2.72899   0.65007  2.7 TiB  505 GiB  504 GiB    8 KiB   1.3 GiB  2.2 TiB  18.07  1.56   88      up
 5    hdd  7.27739   1.00000  7.3 TiB  390 GiB  389 GiB    8 KiB   1.2 GiB  6.9 TiB   5.24  0.45   67      up
 6    hdd  7.27739   1.00000  7.3 TiB  467 GiB  465 GiB   64 KiB   1.3 GiB  6.8 TiB   6.26  0.54   78      up
 2    ssd  0.38249   1.00000  392 GiB   14 GiB   13 GiB   11 KiB  1024 MiB  377 GiB   3.68  0.32    2      up
 4    ssd  0.45479   1.00000  466 GiB   28 GiB   27 GiB    4 KiB  1024 MiB  438 GiB   6.03  0.52    4      up
                     TOTAL     22 TiB  2.5 TiB  2.5 TiB  273 KiB   8.4 GiB   19 TiB  11.57
MIN/MAX VAR: 0.32/7.26  STDDEV: 6.87
To counteract this, I tried to turn on the balancer module.
The module keeps decreasing the reweight of osd.0, while ceph pg stat tells me there are more and more misplaced objects:
144 pgs: 144 active+clean+remapped; 853 GiB data, 2.5 TiB used, 19 TiB / 22 TiB avail; 30 MiB/s wr, 7 op/s; 242259/655140 objects misplaced (36.978%)
So my question is: is Ceph supposed to do that?
Why are all those objects misplaced? Because of those 112 PGs on osd.0?
Why are there 112 PGs on osd.0? I did not set any PG settings except the number: 512.
Thank you very much
Simon Sutter
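A hedged note on what is probably happening here: lowering osd.0's reweight (by hand or by the balancer in crush-compat mode) tells CRUSH to move PGs off that OSD, but since osd.0 is the only OSD on node1 and the rule wants one copy per host, those PGs have nowhere else to go; they stay on osd.0 and are reported as active+clean+remapped, with their objects counted as misplaced. The balancer cannot create capacity the topology does not have. Once the hosts are evened out, upmap mode usually balances better than reweights; a sketch (requires all clients to be Luminous or newer):

    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status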
Upgraded to 14.2.7, doesn't appear to have affected the behavior. As requested:
~$ ceph tell mds.mds1 heap stats
2020-02-10 16:52:44.313 7fbda2cae700 0 client.59208005
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:44.337 7fbda3cb0700 0 client.59249562
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 tcmalloc heap stats:------------------------------------------------
MALLOC: 50000388656 (47684.1 MiB) Bytes in use by application
MALLOC: + 0 ( 0.0 MiB) Bytes in page heap freelist
MALLOC: + 174879528 ( 166.8 MiB) Bytes in central cache freelist
MALLOC: + 14511680 ( 13.8 MiB) Bytes in transfer cache freelist
MALLOC: + 14089320 ( 13.4 MiB) Bytes in thread cache freelists
MALLOC: + 90534048 ( 86.3 MiB) Bytes in malloc metadata
MALLOC: ------------
MALLOC: = 50294403232 (47964.5 MiB) Actual memory used (physical + swap)
MALLOC: + 50987008 ( 48.6 MiB) Bytes released to OS (aka unmapped)
MALLOC: ------------
MALLOC: = 50345390240 (48013.1 MiB) Virtual address space used
MALLOC:
MALLOC: 260018 Spans in use
MALLOC: 20 Thread heaps in use
MALLOC: 8192 Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via madvise()).
Bytes released to the OS take up virtual address space but no physical memory.
~$ ceph tell mds.mds1 heap release
2020-02-10 16:52:47.205 7f037eff5700 0 client.59249625
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:47.237 7f037fff7700 0 client.59249634
ms_handle_reset on v2:x.x.x.x:6800/3372494505
mds.mds1 releasing free RAM back to system.
The pools over 15 minutes or so:
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 2045,
"bytes": 3069493686
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 2445,
"bytes": 3111162538
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 7850,
"bytes": 7658678767
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 12274,
"bytes": 11436728978
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 13747,
"bytes": 11539478519
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 14615,
"bytes": 13859676992
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 23267,
"bytes": 22290063830
}
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 44944,
"bytes": 40726959425
}
And one about a minute after the heap release showing continued growth:
~$ ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon
{
"items": 50694,
"bytes": 47343942094
}
This is on a single active MDS with 2 standbys; the workload scans for about a
million files with about 20 parallel threads on two clients, opening and
reading each file if it exists.
On Wed, Jan 22, 2020 at 8:25 AM John Madden <jmadden.com(a)gmail.com> wrote:
>
> > Couldn't John confirm that this is the issue by checking the heap stats and triggering the release via
> >
> > ceph tell mds.mds1 heap stats
> > ceph tell mds.mds1 heap release
> >
> > (this would be much less disruptive than restarting the MDS)
>
> That was my first thought as well, but `release` doesn't appear to do
> anything in this case.
>
> John
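For tracking the buffer_anon growth over time without collecting samples by hand, a minimal polling sketch using the same admin-socket command shown above (the daemon name mds.mds1 is taken from the thread; adjust as needed):

    # sample buffer_anon once a minute to correlate growth with the scan workload
    while true; do
      date
      ceph daemon mds.mds1 dump_mempools | jq .mempool.by_pool.buffer_anon.bytes
      sleep 60
    done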
Hi all,
We're receiving a certificate error from the telemetry module:
Module 'telemetry' has failed:
HTTPSConnectionPool(host='telemetry.ceph.com', port=443): Max retries
exceeded with url: /report (Caused by SSLError(SSLError("bad handshake:
Error([('SSL routines', 'tls_process_server_certificate', 'certificate
verify failed')],)",),));
It seems the certificate expired yesterday (14th April).
Cheers
Eneko
--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarragako bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es
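A couple of hedged commands for checking this and working around it until the server-side certificate is renewed (openssl output format may vary by version):

    # inspect the validity dates of the certificate served by telemetry.ceph.com
    echo | openssl s_client -connect telemetry.ceph.com:443 2>/dev/null \
      | openssl x509 -noout -dates
    # temporarily stop the telemetry module from reporting (and failing)
    ceph telemetry off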
Hello,
I have a CephFS running correctly on v14.2.8. I also have a VM which
runs Samba as an AD controller and file server (Zentyal). My plan was to
mount a CephFS path on that VM and have Samba share those files to a
Windows network. But I can't make the shares work, as Samba is asking to
mount the CephFS resource with the "user_xattr" mount option, which the
kernel driver doesn't support. Besides that, it looks like I can't set CIFS
permissions on directories/files in CephFS because extended attributes
are not supported.
Is that the expected behavior, or am I overlooking something?
While looking for information about this issue, I have read about the Samba
vfs_ceph module and CTDB. Are those options really necessary to get Samba over CephFS
with CIFS/AD domain permissions?
Thanks a lot.
Victor.
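As far as I know, the kernel CephFS client supports extended attributes natively and simply does not recognize the ext4-style user_xattr mount option, so Samba's check is misleading there. An alternative that avoids the kernel mount entirely is the Samba vfs_ceph module, which reaches CephFS through libcephfs. A minimal smb.conf share sketch, where the share name, path and CephX user are placeholders (CTDB is only needed if you run several clustered Samba gateways):

    [cephshare]
        path = /shares/projects          # path inside the CephFS tree
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba             # CephX client id with access to the path
        kernel share modes = no
        read only = no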