Hi,
Running 14.2.6 on Debian Buster (backports).
I have set up a CephFS with three data pools and one metadata pool:
myfs_data, myfs_data_hdd, myfs_data_ssd, and myfs_metadata.
All file data is, via ceph.dir.layout.pool, stored in either
myfs_data_hdd or myfs_data_ssd. I have also verified this by
dumping the ceph.file.layout.pool attribute of all files.
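For reference, the layouts were set and checked roughly like this (the mount point and paths are just examples):

setfattr -n ceph.dir.layout.pool -v myfs_data_ssd /mnt/myfs/fast
getfattr -n ceph.file.layout.pool /mnt/myfs/fast/some_file

New files created under a directory inherit its layout, which is why dumping ceph.file.layout.pool of every file is a reliable check.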
The filesystem has 1617949 files and 36042 directories.
There are, however, approximately as many objects in the first pool created
for the CephFS, myfs_data, as there are files. Their number also grows and
shrinks as files are created or deleted (so they cannot be leftovers from
earlier exercises). Note how the USED size is reported as 0 bytes,
correctly reflecting that no file data is stored in them.
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR
myfs_data 0 B 1618229 0 4854687 0 0 0 2263590 129 GiB 23312479 124 GiB 0 B 0 B
myfs_data_hdd 831 GiB 136309 0 408927 0 0 0 106046 200 GiB 269084 277 GiB 0 B 0 B
myfs_data_ssd 43 GiB 1552412 0 4657236 0 0 0 181468 2.3 GiB 4661935 12 GiB 0 B 0 B
myfs_metadata 1.2 GiB 36096 0 108288 0 0 0 4828623 82 GiB 1355102 143 GiB 0 B 0 B
Is this expected?
I was assuming that in this scenario all objects, both their data and any
keys, would be either in the metadata pool or in the two pools where the
file data is stored.
Are these some additional metadata keys stored in the first data pool
created for the CephFS? That would not be so nice if the OSD selection
rules for that pool use worse disks than the data itself...
Btw: is there any tool to see the amount of key-value (omap) data associated
with a pool? 'ceph osd df' gives omap and meta per OSD, but not broken
down per pool.
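The closest I have found so far is to sum the per-PG omap statistics from 'ceph pg dump' over the PGs of one pool. This assumes the OMAP_BYTES*/OMAP_KEYS* columns (num_omap_bytes/num_omap_keys in the JSON output) mean what I think they mean, and the JSON layout below is just what I see on 14.2, e.g. for pool id 1:

ceph pg dump --format json 2>/dev/null | jq '[.pg_map.pg_stats[] | select(.pgid | startswith("1.")) | .stat_sum.num_omap_bytes] | add'

so treat that as a sketch rather than a supported interface.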
Best regards,
Håkan
Hi, I am new to RGW and am trying to deploy a multisite cluster in order to
sync data from one cluster to another.
My source zone is the default zone in the default zonegroup, with the
structure below:
realm: big-realm
            |
   zonegroup: default
      /              \
master zone: default   secondary zone: backup
*STEPS*:
on source cluster:
1. radosgw-admin realm create --rgw-realm=big-realm --default
2. radosgw-admin zonegroup modify --rgw-realm big-realm --rgw-zonegroup
default --master --endpoints "http://172.24.29.26:7480"
3. radosgw-admin zone modify --rgw-zonegroup default --rgw-zone default
--master --endpoints "http://172.24.29.26:7480"
4. radosgw-admin user create --uid=sync-user
--display-name="Synchronization User" --access-key=redhat --secret=redhat
--system
5. radosgw-admin zone modify --rgw-zone=default --access-key=redhat
--secret=redhat
6. radosgw-admin period update --commit
on destination cluster:
1. radosgw-admin realm pull --url="http://172.24.29.26:7480"
--access-key=redhat --secret=redhat --rgw-realm=big-realm
2. radosgw-admin realm default --rgw-realm=big-realm
3. radosgw-admin period pull --url="http://172.24.29.26:7480"
--access-key=redhat --secret=redhat
4. radosgw-admin zonegroup default --rgw-zonegroup=default
5. radosgw-admin zone create --rgw-zonegroup=default --rgw-zone=backup
--endpoints="http://172.24.29.29:7480" --access-key=redhat --secret=redhat
--default
6. radosgw-admin period update --commit
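For reference, the resulting realm/zonegroup/zone configuration can be inspected on either cluster with read-only commands such as:

radosgw-admin period get
radosgw-admin zonegroup get --rgw-zonegroup=default
radosgw-admin zone get --rgw-zone=backup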
Committing the period on the secondary zone gets this error:
2020-04-02 14:36:04.707 7fd8ee9376c0 1 Cannot find zone
id=8c75360a-c0cf-4772-b85e-ff74726396c2 (name=backup), switching to local
zonegroup configuration
Sending period to new master zone 5fba7cae-47f1-4c8e-9a34-1b499c9c27f8
request failed: (2202) Unknown error 2202
failed to commit period: (2202) Unknown error 2202
radosgw-admin sync status:
2020-04-02 14:37:18.330 7f27c60676c0 1 Cannot find zone
id=8c75360a-c0cf-4772-b85e-ff74726396c2 (name=backup), switching to local
zonegroup configuration
realm fec73799-36be-4418-abb2-9804cc83d83d (big-realm)
zonegroup fc61ac2f-dc1d-421b-90af-ffe9113b9935 (default)
zone 8c75360a-c0cf-4772-b85e-ff74726396c2 (backup)
metadata sync failed to read sync status: (2) No such file or directory
data sync source: 5fba7cae-47f1-4c8e-9a34-1b499c9c27f8 (default)
syncing
full sync: 0/128 shards
incremental sync: 128/128 shards
data is caught up with source
My source cluster's version is 13.2.8, and the destination cluster's is 14.2.8.
I also tried syncing with both clusters on version 13.2.8 and got the same
error.
Is there any step I got wrong, or can the default zone not be synced?
Thanks
Hi everyone,
my Ceph version is 12.2.12. I want to set require_min_compat_client to
luminous, so I use the command
#ceph osd set-require-min-compat-client luminous
but Ceph reports: Error EPERM: cannot set require_min_compat_client to
luminous: 4 connected client(s) look like jewel (missing
0xa00000000200000); add --yes-i-really-mean-it to do it anyway
[root@node-1 ~]# ceph features
{
    "mon": {
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 3
        }
    },
    "osd": {
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 15
        }
    },
    "client": {
        "group": {
            "features": "0x40106b84a842a52",
            "release": "jewel",
            "num": 4
        },
        "group": {
            "features": "0x3ffddff8eeacfffb",
            "release": "luminous",
            "num": 168
        }
    }
}
So I run the command:
[root@node-1 gyt]# ceph osd set-require-min-compat-client luminous
--yes-i-really-mean-it
set require_min_compat_client to luminous
But now I want to set require_min_compat_client back to jewel, so I use the command:
[root@node-1 gyt]# ceph osd set-require-min-compat-client jewel
Error EPERM: osdmap current utilizes features that require luminous;
cannot set require_min_compat_client below that to jewel
What is the way to change it from luminous back to jewel?
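(For completeness, the value currently enforced can be checked in the OSD map with:

ceph osd dump | grep min_compat_client

which prints the require_min_compat_client line.)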
Hello,
I was hoping someone could clear up the difference between these metrics.
In filestore the difference between Apply and Commit Latency was pretty
clear and these metrics gave a good representation of how the cluster was
performing. High commit usually meant our journals were performing poorly
while high apply pointed to an OSD issue.
With BlueStore, Apply & Commit are now tied to the same metric, and it's not
as clear to me what that metric represents.
In addition, new metrics such as Read and Write Op Latency have been added.
I'm led to believe that these are similar to what Apply Latency used to
represent, but is that actually the case?
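For reference, the numbers I am referring to are the ones surfaced by 'ceph osd perf' (commit_latency(ms) / apply_latency(ms)) and, I believe, the per-OSD perf counters, e.g.:

ceph osd perf
ceph daemon osd.0 perf dump | grep -E 'op_r_latency|op_w_latency'

though the counter names here are just the ones I see on our cluster, so I may be misreading them.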
If anyone who has a better understanding of this than I do can enlighten me
I'd appreciate it!
Thanks,
John
Hi.
I have a cluster that has been running for close to 2 years now - pretty
much with the same settings - but over the past day I'm seeing this warning
(and the cache seems to keep growing). Can I figure out which clients are
accumulating the inodes?
Ceph 12.2.8 - is it OK to just "bump" the memory to, say, 128GB - any
negative side effects?
jk@ceph-mon1:~$ sudo ceph health detail
HEALTH_WARN 1 MDSs report oversized cache; 3 clients failing to respond to
cache pressure
MDS_CACHE_OVERSIZED 1 MDSs report oversized cache
mdsceph-mds1(mds.0): MDS cache is too large (91GB/32GB); 34400070
inodes in use by clients, 3293 stray files
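Would something like the per-session cap counts from the MDS admin socket be the right place to look, e.g.:

ceph daemon mds.ceph-mds1 session ls | grep -E '"id"|num_caps'

assuming 'session ls' on 12.2 reports num_caps per client the way I think it does?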
Thanks - Jesper
Hello,
is there anything else needed besides running:
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD}
bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1
I did so a few weeks ago, and currently I'm seeing that all OSDs
originally deployed with --block-db show 10-20% I/O wait, while all
those converted using ceph-bluestore-tool show 80-100% I/O wait.
Also, is there some tuning available to make more use of the SSD? The SSD
(block-db) is only 0-2% utilized.
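In case it is relevant: something along these lines should show how much of the DB actually lives on the fast device versus the slow device (osd.0 is just an example, and I am assuming the bluefs counters mean what I think they mean):

ceph daemon osd.0 perf dump | jq .bluefs

and then compare db_used_bytes with slow_used_bytes.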
Greets,
Stefan
Hi all,
I read in some release notes that it is recommended to have your default data
pool replicated and to use erasure-coded pools as additional pools through
file layouts. We still have a CephFS with roughly 1PB of usage that has an EC default pool.
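For context, my understanding of the recommended layout-based setup is roughly the following (pool names, profile and path are just examples):

ceph osd pool create cephfs_data_ec 128 erasure myprofile
ceph osd pool set cephfs_data_ec allow_ec_overwrites true
ceph fs add_data_pool cephfs cephfs_data_ec
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/bulk

i.e. the replicated default pool stays in place while data written under the chosen directories goes to the EC pool.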
Is there a way to change the default pool, or some other kind of
migration, without having to recreate the FS?
Thanks!
Kenneth
Hi,
===
NOTE: I do not see my thread in the ceph list for some reason. I don't know if the list received my question or not, so sorry if this is a duplicate.
===
I just deployed a new cluster with cephadm instead of ceph-deploy. In the past, if I changed ceph.conf for tweaking, I was able to copy it and apply it to all servers, but I cannot find how to do this with the new cephadm tool. I made a few changes to ceph.conf, but Ceph is unaware of those changes. How can I apply them? I've deployed it with Docker. Thanks, Gencer.
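P.S. What I have found so far, assuming the centralized config database is what cephadm expects instead of per-host ceph.conf files, is something like:

ceph config assimilate-conf -i /etc/ceph/ceph.conf
ceph config set osd osd_memory_target 4294967296
ceph config dump

(the first imports an existing ceph.conf, the second sets a single option directly, the last shows what the cluster actually sees), but I am not sure this is the intended workflow.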
Quick question, Ceph gurus.
For a 1.1PB raw CephFS system currently storing 191TB of data and 390 million objects (mostly small Python files, ML training files, etc.), how many MDS servers should I be running?
System is Nautilus 14.2.8.
I ask because up to now I have run one MDS with one standby-replay, and occasionally it blows up with large memory consumption, 60GB+, even though I have mds_cache_memory_limit = 32G (and that was 16G until recently). It of course tries to restart on another MDS node, fails again, and after several attempts usually comes back up. Today I increased to two active MDSs, but the question is: what is the optimal number for a pretty active system? The single MDS seemed to regularly run at around 1400 req/s, and I often get up to six clients failing to respond to cache pressure.
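(If it matters, the changes were made with commands roughly like these, assuming the centralized config database is in use:

ceph fs set cephfs max_mds 2
ceph config set mds mds_cache_memory_limit 34359738368

where 34359738368 is simply 32 GiB in bytes.)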
The current setup is:
ceph fs status
cephfs - 71 clients
======
+------+----------------+--------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+----------------+--------+---------------+-------+-------+
| 0 | active | a | Reqs: 447 /s | 12.0M | 11.9M |
| 1 | active | b | Reqs: 154 /s | 1749k | 1686k |
| 1-s | standby-replay | c | Evts: 136 /s | 1440k | 1423k |
| 0-s | standby-replay | d | Evts: 402 /s | 16.8k | 298 |
+------+----------------+--------+---------------+-------+-------+
+-----------------+----------+-------+-------+
| Pool | type | used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 160G | 169G |
| cephfs_data | data | 574T | 140T |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| w |
| x |
| y |
| z |
+-------------+
MDS version: ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)
Regards.
Robert Ruge
Systems & Network Manager
Faculty of Science, Engineering & Built Environment
This is the second time this has happened in a couple of weeks. The MDS locks
up and the standby can't take over, so the monitors blacklist them. I try
to un-blacklist them, but they still say this in the logs:
mds.0.1184394 waiting for osdmap 234947 (which blacklists prior instance)
Looking at a pg dump, it looks like the epoch is past that.
$ ceph pg map 3.756
osdmap e234953 pg 3.756 (3.756) -> up [113,180,115] acting [113,180,115]
Last time it seemed to just recover after about an hour all by itself.
Any way to speed this up?
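For reference, the un-blacklisting was done roughly like this (the address below is a made-up placeholder, not one from our cluster):

ceph osd blacklist ls
ceph osd blacklist rm 10.0.0.1:6800/12345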
Thank you,
Robert LeBlanc
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1