Hello,
In which cases can the "mon_osd_full_ratio" and the "mon_osd_backfillfull_ratio" be exceeded?
More specifically, if a subset of OSDs fails and there isn't enough space left on the remaining OSDs to migrate the PGs of the failed OSDs without exceeding either the "mon_osd_full_ratio" or the "mon_osd_backfillfull_ratio" of at least one OSD, how will the cluster behave? Will it exceed the "mon_osd_full_ratio" or the "mon_osd_backfillfull_ratio" of one or more OSDs to reach an "active+clean" state?
Best regards,
Raphaël
Hello,
I would like to remove the cluster_network, because it uses
10Gbps adapters, while for the public_network I have two 25Gbps
adapters in a LAG group...
I have a cluster managed by the orchestrator.
# ceph config dump
...
global advanced cluster_network 172.30.0.0/16
global advanced public_network a.b.7.0/24
mon advanced public_network a.b.7.0/24
...
How can I do this safely?
Would it be correct to only set:
ceph config set global cluster_network a.b.7.0/24 ?
Do I then have to restart the mon and osd processes?
Many thanks for any advice.
Sincerely
Jan
--
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html
Hi, I have a Ceph cluster running v16.2.10.
To use STS lite, my configuration is as follows:
ceph.conf
...
[client.rgw.ss-rgw-01]
host = ss-rgw-01
rgw_frontends = beast port=8080
rgw_zone=backup-hapu
admin_socket = /var/run/ceph/ceph-client.rgw.ss-rgw-01
rgw_sts_key = qekd3Rd5zXr0adQx
rgw_s3_auth_use_sts = true
$ radosgw-admin role list
{
"RoleId": "778865a0-bc7b-49d4-aed5-a952ac9d5593",
"RoleName": "backup-sts",
"Path": "/",
"Arn": "arn:aws:iam:::role/backup-sts",
"CreateDate": "2022-01-04T10:17:32.373Z",
"MaxSessionDuration": 3600,
"AssumeRolePolicyDocument": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Principal\":{\"AWS\":[\"arn:aws:iam:::user/backup-service\"]},\"Action\":[\"sts:AssumeRole\"]}]}"
},
$ radosgw-admin role policy get --role-name backup-sts --policy-name AllowAccessAllBucket
{
"Permission policy": "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":[\"s3:*\"],\"Resource\":\"arn:aws:s3:::*/*\"}]}"
}
Then I use the credentials of the backup-service user to assume the role:
import boto3

# access_key, secret_key and endpoint_url are defined earlier in the script.
# Obtain temporary credentials by assuming the role via STS.
sts_client = boto3.client('sts',
                          aws_access_key_id=access_key,
                          aws_secret_access_key=secret_key,
                          endpoint_url=endpoint_url,
                          region_name='backup')
response = sts_client.assume_role(
    RoleArn='arn:aws:iam:::role/backup-sts',
    RoleSessionName='Alice2',
    DurationSeconds=3600)

# Use the temporary credentials against S3.
s3client = boto3.client('s3',
                        aws_access_key_id=response['Credentials']['AccessKeyId'],
                        aws_secret_access_key=response['Credentials']['SecretAccessKey'],
                        aws_session_token=response['Credentials']['SessionToken'],
                        endpoint_url=endpoint_url,
                        region_name='backup')
response = s3client.list_buckets()
The result is AccessDenied, but I can't figure out what I'm missing:
Traceback (most recent call last):
  File "fff.py", line 52, in <module>
    response = s3client.list_buckets()
  File "/home/huynnp/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/huynnp/.local/lib/python3.8/site-packages/botocore/client.py", line 980, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListBuckets operation: Unknown
Is my configuration or code wrong?
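One thing I am not sure about: my permission policy only names the object-level resource "arn:aws:s3:::*/*", and I don't know whether ListBuckets is checked against a bucket-level ARN instead. For reference, this is the broader (untested) policy document I was thinking of trying; the extra "arn:aws:s3:::*" resource is only a guess on my part:

import json

# Untested guess: also include the bucket-level ARN, in case ListBuckets is
# evaluated against "arn:aws:s3:::*" rather than the object-level "*/*".
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:*"],
        "Resource": ["arn:aws:s3:::*", "arn:aws:s3:::*/*"],
    }],
}
print(json.dumps(policy))  # output would go back into the role's permission policy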
Thanks in advance
I was checking the tracker again and found an already-fixed issue that seems to be related to this one.
https://tracker.ceph.com/issues/44508
Here is the PR that fixes it https://github.com/ceph/ceph/pull/33807
What I still don't understand is why this only happens when using the s3website API.
Is there someone who could shed some light on this?
Regards,
Ondrej
I am using ceph 17.2.6 on Rocky 8.
I have a system that started giving me large omap object warnings.
I tracked this down to a specific index shard for a single s3 bucket.
rados -p <indexpool> listomapkeys .dir.<zoneid>.bucketid.nn.shardid
shows over 3 million keys for that shard. There are only about 2
million objects in the entire bucket according to a listing of the bucket
and radosgw-admin bucket stats --bucket bucketname. No other shard
has anywhere near this many index objects. Perhaps it should be noted that this
shard is the highest numbered shard for this bucket. For a bucket with
16 shards, this is shard 15.
If I look at the list of omap keys generated, there are *many*
beginning with "<80>0_0000", almost the entire set of the three-plus million
keys in the shard. These are index objects in the so-called 'ugly' namespace. The rest of the omap keys appear to be normal.
The 0_0000 after the <80> indicates some sort of 'bucket log index' according to src/cls/rgw/cls_rgw.cc.
However, using some sed magic previously discussed here, I ran:
rados -p <indexpool> getomapval .dir.<zoneid>.bucketid.nn.shardid --omap-key-file /tmp/key.txt
Where /tmp/key.txt contains only the funny <80>0_0000 key name without a newline
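For reference, roughly the Python equivalent of that sed magic, which also counts how many keys carry the <80>0_0000 prefix (untested sketch; the pool and object names are the same placeholders as above, and the key handling is my assumption):

import subprocess

INDEX_POOL = "<indexpool>"                          # placeholder
INDEX_OBJ = ".dir.<zoneid>.bucketid.nn.shardid"     # placeholder

# List all omap keys on the index shard and count the 'ugly namespace' ones.
keys = subprocess.run(
    ["rados", "-p", INDEX_POOL, "listomapkeys", INDEX_OBJ],
    capture_output=True, check=True).stdout.splitlines()
log_keys = [k for k in keys if k.startswith(b"\x80" + b"0_0000")]
print(f"{len(log_keys)} of {len(keys)} keys look like bucket log index entries")

# Write one such key to a file with no trailing newline and dump its value.
with open("/tmp/key.txt", "wb") as f:
    f.write(log_keys[0])
subprocess.run(
    ["rados", "-p", INDEX_POOL, "getomapval", INDEX_OBJ,
     "--omap-key-file", "/tmp/key.txt"],
    check=True)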
The output of this shows, in a hex dump, the object name to which the index
refers, which was at one time a valid object.
However, that object no longer exists in the bucket, and based on expiration policy, was
previously deleted. Let's say, in the hex dump, that the object was:
foo/bar/baz/object1.bin
The prefix foo/bar/baz/ used to have 32 objects, say foo/bar/baz/{object1.bin, object2.bin, ... }
An s3api listing shows that those objects no longer exist (and that is OK, as they were previously deleted).
BUT now there is a weirdo object left in the bucket:
foo/bar/baz/ <- with the slash at the end, and it is an object, not a PRE(fix).
All objects under foo/ have a 3-day lifecycle expiration. If I wait (at most) 3 days, the weirdo object with '/'
at the end will be deleted, or I can delete it manually using aws s3api. But either way, the log index
objects, <80>0_0000.... remain.
The bucket in question is heavily used. But with over 3 million of these <80>0_0000 objects (and growing)
in a single shard, I am currently at a loss as to what to do or how to stop this from occurring.
I've poked around at a few other buckets and found a few others that have this problem, but only a few hundred <80>0_000.... index objects in a shard, nowhere near enough to cause the large omap warning that led me to this post.
Any ideas?
Hello All,
I found a weird issue with ceph_readdirplus_r() when used along
with ceph_ll_lookup_vino().
On ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy
(stable)
Any help is really appreciated.
Thanks in advance,
-Joe
Test Scenario:
A. Create a CephFS subvolume "4" and create a directory "user_root" in the
root of the subvolume.
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
ceph fs subvolume ls cephfs
[
{
"name": "4"
}
]
root@ss-joe-01
(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
ls -l
total 0
drwxrwxrwx 2 root root 0 Sep 22 09:16 user_root
root@ss-joe-01
(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
B. In the "user_root" directory create some files and directories
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
mkdir dir1 dir2
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
ls
dir1 dir2
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
echo "Hello Worldls!" > file1
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
echo "Hello Worldls!" > file2
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
ls
dir1 dir2 file1 file2
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
cat file*
Hello Worldls!
Hello Worldls!
C. Create a subvolume snapshot "sofs-4-5". Please ignore the older
snapshots.
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
ceph fs subvolume snapshot ls cephfs 4
[
{
"name": "sofs-4-1"
},
{
"name": "sofs-4-2"
},
{
"name": "sofs-4-3"
},
{
"name": "sofs-4-4"
},
{
"name": "sofs-4-5"
}
]
root@ss-joe-01
(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
Here "sofs-4-5" has snapshot id 6.
Got this from libcephfs and have verified at Line
snapshot_inode_lookup.cpp#L212. (Attached to the email)
#Content within the snapshot
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23#
cd .snap/
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap#
ls
_sofs-4-1_1099511627778 _sofs-4-2_1099511627778 _sofs-4-3_1099511627778
_sofs-4-4_1099511627778 _sofs-4-5_1099511627778
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap#
cd _sofs-4-5_1099511627778/
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap/_sofs-4-5_1099511627778#
ls
user_root
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap/_sofs-4-5_1099511627778#
cd user_root/
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap/_sofs-4-5_1099511627778/user_root#
ls
dir1 dir2 file1 file2
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap/_sofs-4-5_1099511627778/user_root#
cat file*
Hello Worldls!
Hello Worldls!
root@ss-joe-01
(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/.snap/_sofs-4-5_1099511627778/user_root#
D. Delete all the files and directories in "user_root"
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
rm -rf *
root@ss-joe-01(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
ls
root@ss-joe-01
(bash):/mnt/cephfs/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root#
E. Using libcephfs in a C++ program (attached to this email), do the
following:
1. Get the Inode of "user_root" using ceph_ll_walk().
2. Open the directory using Inode received from ceph_ll_walk() and do
ceph_readdirplus_r()
We don't see any dentries (except "." and "..") as we have deleted all
files and directories in the active filesystem. This is expected and
correct!
=================================/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root/=====================================
Path/Name
:"/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root/"
Inode Address : 0x7f5ce0009900
Inode Number : 1099511629282
Snapshot Number : 18446744073709551614
Inode Number : 1099511629282
Snapshot Number : 18446744073709551614
. Ino: 1099511629282 SnapId: 18446744073709551614 Address: 0x7f5ce0009900
.. Ino: 1099511627779 SnapId: 18446744073709551614 Address:
0x7f5ce00090f0
3. Using ceph_ll_lookup_vino(), get the Inode * of "user_root" for
snapshot 6. Here "sofs-4-5" has snapshot id 6;
I got this from libcephfs and verified it at line
snapshot_inode_lookup.cpp#L212 (attached to this email).
4. Open the directory using Inode * received from ceph_ll_lookup_vino()
and do ceph_readdirplus_r()
We don't see any dentries (except "." and ".."). This is NOT expected and
NOT correct, as there are files and directories in snapshot 6.
=================================1099511629282:6=====================================
Path/Name :"1099511629282:6"
Inode Address : 0x7f5ce000a110
Inode Number : 1099511629282
Snapshot Number : 6
Inode Number : 1099511629282
Snapshot Number : 6
. Ino: 1099511629282 SnapId: 6 Address: 0x7f5ce000a110
.. Ino: 1099511629282 SnapId: 6 Address: 0x7f5ce000a110
5. Get the Inode of "user_root/.snap/_sofs-4-5_1099511627778/" using
ceph_ll_walk().
6. Open the directory using Inode received from ceph_ll_walk() and do
ceph_readdirplus_r()
We see ALL dentries of all files and directories in the snapshot. This
is expected and correct!
=================================/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root/.snap/_sofs-4-5_1099511627778/=====================================
Path/Name
:"/volumes/_nogroup/4/f0fae76f-196d-4ebd-b8d0-528985505b23/user_root/.snap/_sofs-4-5_1099511627778/"
Inode Address : 0x7f5ce000a110
Inode Number : 1099511629282
Snapshot Number : 6
Inode Number : 1099511629282
Snapshot Number : 6
. Ino: 1099511629282 SnapId: 6 Address: 0x7f5ce000a110
.. Ino: 1099511629282 SnapId: 18446744073709551615 Address:
0x5630ab946340
file1 Ino: 1099511628291 SnapId: 6 Address: 0x7f5ce000aa90
dir1 Ino: 1099511628289 SnapId: 6 Address: 0x7f5ce000b180
dir2 Ino: 1099511628290 SnapId: 6 Address: 0x7f5ce000b800
file2 Ino: 1099511628292 SnapId: 6 Address: 0x7f5ce000be80
7. Now, again using ceph_ll_lookup_vino(), get the Inode * of "user_root"
for snapshot 6. Here "sofs-4-5" has snapshot id 6.
8. Open the directory using the Inode * received from
ceph_ll_lookup_vino() and do ceph_readdirplus_r().
Now we see all the files and directories in the snapshot!
=================================1099511629282:6=====================================
Path/Name :"1099511629282:6"
Inode Address : 0x7f5ce000a110
Inode Number : 1099511629282
Snapshot Number : 6
Inode Number : 1099511629282
Snapshot Number : 6
. Ino: 1099511629282 SnapId: 6 Address: 0x7f5ce000a110
.. Ino: 1099511629282 SnapId: 18446744073709551615 Address:
0x5630ab946340
file1 Ino: 1099511628291 SnapId: 6 Address: 0x7f5ce000aa90
dir1 Ino: 1099511628289 SnapId: 6 Address: 0x7f5ce000b180
dir2 Ino: 1099511628290 SnapId: 6 Address: 0x7f5ce000b800
file2 Ino: 1099511628292 SnapId: 6 Address: 0x7f5ce000be80
Am I missing something using these APIs?
Files attached to this email:
- Full output of the program - snapshot_inode_lookup.cpp_output.txt <attached>
- C++ program - snapshot_inode_lookup.cpp <attached>
- /etc/ceph/ceph.conf <attached>
- Ceph client log during the run of this C++ program - client.log <attached>
Compile Command:
g++ -o snapshot_inode_lookup ./snapshot_inode_lookup.cpp -g -ldl -ldw
-lcephfs -lboost_filesystem --std=c++17
Linux Details,
root@ss-joe-01(bash):/home/hydrauser# uname -a
Linux ss-joe-01 5.10.0-23-amd64 #1 SMP Debian 5.10.179-1 (2023-05-12)
x86_64 GNU/Linux
root@ss-joe-01(bash):/home/hydrauser#
Ceph Details,
root@ss-joe-01(bash):/home/hydrauser# ceph -v
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy
(stable)
root@ss-joe-01(bash):/home/hydrauser#
root@ss-joe-01(bash):/home/hydrauser# ceph -s
cluster:
id: fb43d857-d165-4189-87fc-cf1debce9170
health: HEALTH_OK
services:
mon: 3 daemons, quorum ss-joe-01,ss-joe-02,ss-joe-03 (age 4d)
mgr: ss-joe-01(active, since 4d), standbys: ss-joe-03, ss-joe-02
mds: 1/1 daemons up
osd: 3 osds: 3 up (since 4d), 3 in (since 4d)
data:
volumes: 1/1 healthy
pools: 3 pools, 49 pgs
objects: 39 objects, 1.0 MiB
usage: 96 MiB used, 30 GiB / 30 GiB avail
pgs: 49 active+clean
root@ss-joe-01(bash):/home/hydrauser#
root@ss-joe-01(bash):/home/hydrauser# dpkg -l | grep ceph
ii ceph 17.2.5-1~bpo11+1
amd64 distributed storage and file system
ii ceph-base 17.2.5-1~bpo11+1
amd64 common ceph daemon libraries and management tools
ii ceph-base-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-base
ii ceph-common 17.2.5-1~bpo11+1
amd64 common utilities to mount and interact with a ceph
storage cluster
ii ceph-common-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-common
ii ceph-fuse 17.2.5-1~bpo11+1
amd64 FUSE-based client for the Ceph distributed file
system
ii ceph-fuse-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-fuse
ii ceph-mds 17.2.5-1~bpo11+1
amd64 metadata server for the ceph distributed file system
ii ceph-mds-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-mds
ii ceph-mgr 17.2.5-1~bpo11+1
amd64 manager for the ceph distributed storage system
ii ceph-mgr-cephadm 17.2.5-1~bpo11+1
all cephadm orchestrator module for ceph-mgr
ii ceph-mgr-dashboard 17.2.5-1~bpo11+1
all dashboard module for ceph-mgr
ii ceph-mgr-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-mgr
ii ceph-mgr-diskprediction-local 17.2.5-1~bpo11+1
all diskprediction-local module for ceph-mgr
ii ceph-mgr-k8sevents 17.2.5-1~bpo11+1
all kubernetes events module for ceph-mgr
ii ceph-mgr-modules-core 17.2.5-1~bpo11+1
all ceph manager modules which are always enabled
ii ceph-mon 17.2.5-1~bpo11+1
amd64 monitor server for the ceph storage system
ii ceph-mon-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-mon
ii ceph-osd 17.2.5-1~bpo11+1
amd64 OSD server for the ceph storage system
ii ceph-osd-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for ceph-osd
ii ceph-volume 17.2.5-1~bpo11+1
all tool to facilidate OSD deployment
ii cephadm 17.2.5-1~bpo11+1
amd64 cephadm utility to bootstrap ceph daemons with
systemd and containers
ii libcephfs2 17.2.5-1~bpo11+1
amd64 Ceph distributed file system client library
ii libcephfs2-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for libcephfs2
ii libsqlite3-mod-ceph 17.2.5-1~bpo11+1
amd64 SQLite3 VFS for Ceph
ii libsqlite3-mod-ceph-dbg 17.2.5-1~bpo11+1
amd64 debugging symbols for libsqlite3-mod-ceph
ii python3-ceph-argparse 17.2.5-1~bpo11+1
all Python 3 utility libraries for Ceph CLI
ii python3-ceph-common 17.2.5-1~bpo11+1
all Python 3 utility libraries for Ceph
ii python3-cephfs 17.2.5-1~bpo11+1
amd64 Python 3 libraries for the Ceph libcephfs library
ii python3-cephfs-dbg 17.2.5-1~bpo11+1
amd64 Python 3 libraries for the Ceph libcephfs library
root@ss-joe-01(bash):/home/hydrauser#
Hi,
is it possible to use one cephx key for multiple RGW instances running in parallel?
Maybe I could just use the same 'name' and the same key for all of the RGW
instances?
I plan to start RGWs all over the place in containers and let BGP handle the
traffic. But I don't know how to create on-demand keys that get removed
when the RGW shuts down.
I don't want to use the orchestrator for this, because I would need to add
all the compute nodes to it and there might be other processes in place
that add FW rules in our provisioning.
Cheers
Boris
Hi,
A question, to avoid using a too-elaborate method for finding the most recent snapshot of an RBD image.
So, what would be the preferred way to find the latest snapshot of this image?
root@hvs001:/# rbd snap ls libvirt-pool/CmsrvDOM2-MULTIMEDIA
SNAPID NAME SIZE PROTECTED TIMESTAMP
223 snap_5 435 GiB yes Fri Sep 15 15:33:39 2023
262 snap_1 435 GiB yes Mon Sep 18 15:39:36 2023
280 snap_3 435 GiB yes Wed Sep 20 15:39:42 2023
I would tend to select the highest snapid. But at some point, will the next snapid restart at 1? So maybe not the best idea.
I could select by date/time, but I don't have an easy way to convert the text string to a timestamp...
I've looked at rbd help snap ls; there seems to be no way to sort or format the timestamp output...
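The closest I've come up with so far is parsing the JSON output myself, but it feels clunky (untested sketch; I'm assuming the timestamp field in the --format json output looks like the one shown above):

import json
import subprocess
from datetime import datetime

IMAGE = "libvirt-pool/CmsrvDOM2-MULTIMEDIA"

# Parse "rbd snap ls --format json" and pick the snapshot with the newest timestamp.
out = subprocess.run(["rbd", "snap", "ls", IMAGE, "--format", "json"],
                     capture_output=True, check=True).stdout
snaps = json.loads(out)

# Timestamp assumed to look like "Fri Sep 15 15:33:39 2023".
latest = max(snaps, key=lambda s: datetime.strptime(s["timestamp"],
                                                    "%a %b %d %H:%M:%S %Y"))
print(latest["name"], latest["timestamp"])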
Any advice will be appreciated.
Greetings,
Dominique.
Hi Community,
I recently proposed a new authorization mechanism for RGW that can let the
RGW daemon ask an external service to authorize a request based on AWS S3
IAM tags (meaning the external service would receive the same environment that an
IAM policy document would have available to evaluate the policy).
You can find the documentation of the implementation here:
https://github.com/clwluvw/ceph/blob/rgw-external-iam/doc/radosgw/external-…
And the PR here: https://github.com/ceph/ceph/pull/53345
We would love to hear feedback on whether anyone else feels a need for this
and what you think about the APIs.
Best,