Hi all,
I have 3 Nautilus clusters. They started out as Mimic but were recently
upgraded to Nautilus. On two of them dynamic bucket index resharding seems
to work automatically. However, on one of the clusters it doesn't, and I
have no clue why.
* If I execute 'radosgw-admin bucket limit check' I find some buckets over
the default limit:
{
    "bucket": "acc-dump",
    "tenant": "acc",
    "num_objects": 137290,
    "num_shards": 0,
    "objects_per_shard": 137290,
    "fill_status": "OVER 100.000000%"
}
* the config value for rgw_dynamic_resharding is true
* On 'radosgw-admin reshard stale-instances list' I get an error:
Resharding disabled in a multisite env, stale instances unlikely from
resharding
These instances may not be safe to delete.
Use --yes-i-really-mean-it to force displaying these instances.
(This does not happen on the other two clusters.) However, this is a
single-site cluster:
radosgw-admin realm list
{
    "default_info": "e724bd71-31eb-45c8-a456-151f6a5aa8b5",
    "realms": [
        "intern"
    ]
}
radosgw-admin zone list
{
    "default_info": "7f9bebd6-a9cf-4006-83b1-ff99391aacc0",
    "zones": [
        "dc3-intern"
    ]
}
radosgw-admin zonegroup list
{
    "default_info": "ce4329ae-2bc8-4117-9b82-271022b223fa",
    "zonegroups": [
        "dc3"
    ]
}
* On 'radosgw-admin reshard list' I get loads of errors:
[2020-05-06 10:08:49.261 7f1790b607c0 -1 ERROR: failed to list reshard log
entries, oid=reshard.0000000000
2020-05-06 10:08:49.265 7f1790b607c0 -1 ERROR: failed to list reshard log
entries, oid=reshard.0000000001
etc
I've gone through the documentation and some blogs but I have no clue how
to tackle this problem. Any help would be much appreciated.
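In case it helps, this is the manual workaround I am considering for the
bucket above (the shard count is just my guess, and I'm not sure whether
the tenant goes into --bucket as 'tenant/bucket' or is passed via --tenant):

radosgw-admin bucket reshard --bucket=acc/acc-dump --num-shards=2
radosgw-admin bucket limit check    # re-check fill_status afterwards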
Thanks
Marcel
Hi all,
The Ceph documentation mentions that it has two types of tests: *unit tests* (also
called make check tests) and *integration tests*. Strictly speaking, the *make
check tests* are not “unit tests”, but rather tests that can be run easily
on a single build machine after compiling Ceph from source.
unit tests: https://github.com/ceph/ceph/tree/master/src/test
To develop on Ceph, I am using a Ceph utility, *vstart.sh*, which
allows me to deploy a fake local cluster for development purposes. I am doing
unit testing, and these tests are helping me. Thanks!
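For context, this is roughly how I run things today (the flags and the
ctest filter are only my understanding of the developer guide, not a
verified recipe):

cd build
../src/vstart.sh -d -n -x      # spin up a small local dev cluster
ctest -R unittest_ -j4         # run a subset of the "make check" unit tests
../src/stop.sh                 # tear the dev cluster down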
My question: how realistic and large is the workload generated by the unit
tests? Are these tests sufficient for profiling function call counts, loop
counts, and parallelism to a good extent?
Thanks in advance !
BR
Bobby !
Hi,
I just deployed a new cluster with cephadm instead of ceph-deploy. In the past, if I changed ceph.conf for tweaking, I was able to copy it to all servers and apply it, but I cannot find how to do this with the new cephadm tool.
I made a few changes to ceph.conf but Ceph is unaware of those changes. How can I apply them? I deployed with Docker.
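From what I've read, cephadm clusters are meant to use the centralized config database rather than per-host ceph.conf files, so I assume something like the following is the intended way (the option below is only an example):

ceph config assimilate-conf -i /etc/ceph/ceph.conf   # import existing ceph.conf settings
ceph config set osd osd_max_backfills 2              # or set individual options
ceph config dump                                     # verify what the cluster now uses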
Thanks,
Gencer.
Hi Eric,
Would it be possible to use it with an older cluster version (e.g. running
the new radosgw-admin in a container, connecting to a cluster on 14.2.X)?
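What I have in mind is roughly the following (image tag, mounts and networking
are only a sketch of the idea, nothing I have verified against 14.2.X):

docker run --rm --net=host \
  -v /etc/ceph:/etc/ceph:ro \
  ceph/ceph:v15 \
  radosgw-admin --version    # and then the new orphan listing command once it is available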
Kind regards / Pozdrawiam,
Katarzyna Myrek
On Thu, 16 Apr 2020 at 19:58, EDH - Manuel Rios
<mriosfer(a)easydatahost.com> wrote:
>
> Hi Eric,
>
>
>
> Are there any ETA for get those script backported maybe in 14.2.10?
>
>
>
> Regards
>
> Manuel
>
>
>
>
>
> From: Eric Ivancich <ivancich(a)redhat.com>
> Sent: Thursday, April 16, 2020 19:05
> To: Katarzyna Myrek <katarzyna(a)myrek.pl>; EDH - Manuel Rios <mriosfer(a)easydatahost.com>
> CC: ceph-users(a)ceph.io
> Subject: Re: [ceph-users] RGW and the orphans
>
>
>
> There is currently a PR for an “orphans list” capability. I’m working on the testing side to make sure it’s part of our teuthology suite.
>
>
>
> See: https://github.com/ceph/ceph/pull/34148
>
>
>
> Eric
>
>
>
>
>
> On Apr 16, 2020, at 9:26 AM, Katarzyna Myrek <katarzyna(a)myrek.pl> wrote:
>
>
>
> Hi
>
> Thanks for the quick response.
>
> To be honest my cluster is getting full because of that trash and I am
> at the point where I have to do the removal manually ;/.
>
> Kind regards / Pozdrawiam,
> Katarzyna Myrek
>
> On Thu, 16 Apr 2020 at 13:09, EDH - Manuel Rios
> <mriosfer(a)easydatahost.com> wrote:
>
>
> Hi,
>
> In my experience, 'orphans find' hasn't worked for several releases now; the command should be re-coded or deprecated because it does not run to completion.
>
> In our case it loops over the generated shards until the RGW daemon crashes.
>
> I'm interested in this thread; in our case 'orphans find' takes more than 24 hours to start looping over the shards, but it never gets past shard 0 or 1.
>
> The Ceph RGW devs should provide a workaround script, a new tool, or something else to maintain our RGW clusters, because with the recent bugs every RGW cluster has accumulated a ton of trash, wasting resources and money.
>
> And manual cleaning is neither trivial nor easy.
>
> Waiting for more info,
>
> Manuel
>
>
> -----Original message-----
> From: Katarzyna Myrek <katarzyna(a)myrek.pl>
> Sent: Thursday, April 16, 2020 12:38
> To: ceph-users(a)ceph.io
> Subject: [ceph-users] RGW and the orphans
>
> Hi
>
> Is there any new way to find and remove orphans from RGW pools on Nautilus? I have found information that "orphans find" is now deprecated.
>
> I can see that I have tons of orphans in one of our clusters. I was wondering how to remove them safely, i.e. how to make sure they really are orphans.
> Does anyone have a good method for that?
>
> My cluster mostly has orphans from multipart uploads.
>
>
> Kind regards / Pozdrawiam,
> Katarzyna Myrek
Hi Cephers,
I am working on Ceph librados. Currently I can run sequential/synchronous
read and write tests in both C and C++. However, I am struggling with
asynchronous/non-sequential test code. Are there any test repositories
which contain asynchronous/non-sequential example code?
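A sketch of what I have been trying, in case it clarifies the question (the
file path, the binary name, and the assumption that the binary picks up
./ceph.conf from the build directory are all mine, so please correct me if
this is the wrong place to look):

ls src/test/librados/            # e.g. aio.cc exercises the rados_aio_* / AioCompletion paths
cd build
../src/vstart.sh -d -n           # small local dev cluster
./bin/ceph_test_rados_api_aio    # run the aio API tests against it
../src/stop.sh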
Thanks in advance
Bobby !
Hello,
I wanted to know whether rbd will flush any writes in the page cache when a
volume is "unmap"ed on the host, or if we need to flush explicitly using
"sync" before the unmap?
Thanks,
Shridhar
Hello All,
One of the use cases (e.g. machine learning workloads) for RBD volumes in
our production environment is that users can mount an RBD volume in RW
mode in a container, write some data to it, and later use the same volume in
RO mode in a number of containers in parallel to consume the data.
I am trying to test this scenario with different file systems (ext3/4 and
xfs). I have automated test code that creates a volume, maps it to a
node, mounts it in RW mode, and writes some data into it. Later the same volume
is mounted in RO mode on a number of other nodes and a process reads the
file (rough sketch of the flow below).
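The flow, roughly (pool/volume names, sizes and mountpoints are placeholders;
this assumes the writer unmounts cleanly before the readers mount):

rbd create testpool/mlvol --size 10G
rbd map testpool/mlvol                        # on the writer node
mkfs.xfs /dev/rbd0
mount /dev/rbd0 /mnt/mlvol
dd if=/dev/urandom of=/mnt/mlvol/data bs=1M count=100
umount /mnt/mlvol
rbd unmap /dev/rbd0

# on each of the 6 reader nodes:
rbd map testpool/mlvol --read-only
mount -o ro,nouuid /dev/rbd0 /mnt/mlvol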
I don't see any issues with ext3 or ext4 filesystems, but with XFS I notice
that 1 or 2 (out of 6) parallel read-only mounts fail with a "Structure needs
cleaning" error. What is surprising is that the remaining 4 or 5 mounts
succeed and I don't see any I/O issues on those - which suggests that
there shouldn't be any corruption on the volume itself. Also note that
there is no other process writing to the volume at this time, so there is
no chance of corruption that way.
I am doing xfs mounts with "ro,nouuid" mount options.
Any inputs on why I may be seeing this issue randomly?
Regards,
Shridhar
I activated the autoscaler on all my pools but found this error message when it came to the cache_pool:
ceph-mgr.pve22.log.3:2020-05-01 23:59:24.014 7f5120eda700 0 mgr[pg_autoscaler] pg_num adjustment on cache_pool to 512 failed: (-1, '', 'splits in cache pools must be followed by scrubs and leave sufficient free space to avoid overfilling. use --yes-i-really-mean-it to force.')
I ended up clearing the cache pool by limiting it to a maximum of 1 object and then increasing pg_num/pgp_num. Now I expect the scrub to begin automatically every 24h.
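For reference, the manual steps looked roughly like this (pool name and PG counts are from my setup; the force flag is the one the error message asks for):

ceph osd pool set cache_pool target_max_objects 1    # let the tiering agent drain the cache
ceph osd pool set cache_pool pg_num 512 --yes-i-really-mean-it
ceph osd pool set cache_pool pgp_num 512 --yes-i-really-mean-it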
Should this be included in the docs? Is this a bug in pg_autoscaler?
BR,
Alex
Hi all,
I'm planning to upgrade one of my Ceph clusters, currently on Luminous
12.2.13 / Debian Stretch (fully updated).
On this cluster, Luminous is packaged from the official Ceph repo (deb
https://download.ceph.com/debian-luminous/ stretch main)
I would like to upgrade it with Debian Buster and Nautilus using the
croit.io repository (deb https://mirror.croit.io/debian-nautilus/ buster
main)
I have already prepared the step-by-step procedure, but I just want to verify
one point regarding the upgrade of the Ceph packages.
Do I have to upgrade Ceph at the same time as Debian, or do I have to
upgrade Ceph after the Debian upgrade from Stretch to Buster?
1) In the first case (Debian and Ceph upgraded together):
* Replace stretch by buster in /etc/apt/sources.list
* Modify the ceph.list repo by croit.io one
* Upgrade the entire nodes
2) In the second case (upgrade Debian first, then Ceph; rough command sketch after this list):
* Replace stretch by buster in /etc/apt/sources.list
* keep the /etc/apt/sources.list.d/ceph.list as it is
* Upgrade and reboot the nodes
* replace the ceph.list file by croit.io
* upgrade the ceph packages
* restarting the Ceph services (in the right order MON -> MGR -> OSD
-> MDS)
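For case 2, the per-node commands I have in mind look roughly like this (the
repo lines are the ones above; everything else is standard apt/systemd and
would of course be adapted):

sed -i 's/stretch/buster/g' /etc/apt/sources.list
# keep /etc/apt/sources.list.d/ceph.list unchanged for now
apt update && apt full-upgrade
reboot
# after the reboot, switch the Ceph repo and upgrade the packages:
echo 'deb https://mirror.croit.io/debian-nautilus/ buster main' > /etc/apt/sources.list.d/ceph.list
apt update && apt install ceph
# then restart the services in order, MON -> MGR -> OSD -> MDS:
systemctl restart ceph-mon.target
systemctl restart ceph-mgr.target
systemctl restart ceph-osd.target
systemctl restart ceph-mds.target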
Thanks a lot for your advice.
Regards,
Hervé