Hi,
After upgrading to Ceph 16.2.14 we had several OSD crashes
in the bstore_kv_sync thread:
1. "assert_thread_name": "bstore_kv_sync",
2. "backtrace": [
3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
4. "gsignal()",
5. "abort()",
6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x1a9) [0x564dc5f87d0b]",
7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t
const&)+0x15e) [0x564dc6604a9e]",
9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned
long)+0x77d) [0x564dc66951cd]",
10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90)
[0x564dc6695670]",
11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions
const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402)
[0x564dc6c761c2]",
15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup
const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned
long)+0x309) [0x564dc6b780c9]",
17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&,
rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned
long, bool, unsigned long*, unsigned long,
rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&,
rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&,
std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
20. "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a)
[0x564dc6b2004a]",
21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
24. "clone()"
25. ],
I am attaching two instances of crash info for further reference:
https://pastebin.com/E6myaHNU
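For reference, these dumps are essentially what the crash module records; the same details can be listed and retrieved on the cluster with the commands below (<crash-id> is a placeholder):

ceph crash ls
ceph crash info <crash-id>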
OSD configuration is rather simple and close to default:
osd.6  dev       bluestore_cache_size_hdd   4294967296
osd.6  dev       bluestore_cache_size_ssd   4294967296
osd    advanced  debug_rocksdb              1/5
osd    advanced  osd_max_backfills          2
osd    basic     osd_memory_target          17179869184
osd    advanced  osd_recovery_max_active    2
osd    advanced  osd_scrub_sleep            0.100000
osd    advanced  rbd_balance_parent_reads   false
debug_rocksdb is a recent change; otherwise this configuration has been
running without issues for months. The crashes happened on two different
hosts with identical hardware, and neither the hosts nor the storage
(NVMe DB/WAL, HDD block) exhibit any issues. We have not experienced such
crashes with Ceph < 16.2.14.
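If more data would help, I can also pull the BlueFS usage details from the
affected OSDs via the admin socket (osd.6 is just an example id):

ceph daemon osd.6 bluefs stats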
Is this a known issue, or should I open a bug report?
Best regards,
Zakhar
Hi all,
I have some trouble with my backup script because there are a few files, in a deep sub-directory, with a creation/modification date in the future (for example: 2040-02-06 18:00:00). As my script uses the ceph.dir.rctime extended attribute to identify the files and directories to back up, it now browses and syncs a lot of unchanged sub-directories…
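For reference, the script reads the attribute roughly like this (mount point and path are just examples) and compares the returned timestamp against the time of the previous backup run:

getfattr -n ceph.dir.rctime /mnt/cephfs/some/dir

Because of those future-dated files, every parent directory up the chain now reports an rctime in 2040, so everything looks "newer" than the last backup.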
I tried a lot of things, including removing and recreating the files so that they now have the current datetime, but rctime is never updated. Even when I remove the last directory (i.e. the one where the files are located), rctime is not updated on the parent directories.
Does anyone have a trick to reset rctime to the current datetime (or any other way to get rid of this inconsistent rctime value) on Quincy (17.2.6)?
Regards,
Arnaud
Greetings -
Forgive me if this is an elementary question - I am fairly new to running Ceph and have searched but didn't find anything specific.
Is there any way to disable the disk space warnings (CephNodeDiskspaceWarning) for specific drives or filesystems on my Ceph servers?
I am running 18.2.0, installed with cephadm on Ubuntu 22.04 on Arm. I keep seeing these warnings in the Dashboard for /boot/firmware, which, in my opinion, shouldn't really be something Ceph needs to worry about - or at least should be something I can configure it to ignore.
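Ideally I'd be able to express this as a silence or an exclusion for just that mountpoint. Purely to illustrate what I mean - I don't know whether the alert actually carries a mountpoint label, and the Alertmanager host below is just a placeholder for the one cephadm deploys - something like:

curl -X POST http://<alertmanager-host>:9093/api/v2/silences \
  -H 'Content-Type: application/json' \
  -d '{"matchers": [
         {"name": "alertname", "value": "CephNodeDiskspaceWarning", "isRegex": false},
         {"name": "mountpoint", "value": "/boot/firmware", "isRegex": false}],
       "startsAt": "2023-10-01T00:00:00Z", "endsAt": "2024-10-01T00:00:00Z",
       "createdBy": "dan", "comment": "ignore /boot/firmware"}'

But maybe there is a cleaner, supported way to do this from the Ceph side?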
Thanks in advance.
Dan.
Hi Everyone,
My company is dealing with a quite large Ceph cluster (>10k OSDs, >60 PB of data). It is entirely dedicated to object storage with an S3 interface. Maintenance and expansion are getting more and more problematic and time-consuming. We are considering splitting it into two or more completely separate clusters (without replicating data among them) and creating an S3 abstraction layer with some additional metadata that will allow us to use these 2+ physically independent instances as one logical cluster. Additionally, the newest data is the most in demand, so we have to spread it equally among the clusters to avoid skew in cluster load.
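To make it a bit more concrete: the idea is a thin routing layer that keeps a bucket-to-cluster mapping and forwards each S3 request to the physical cluster that owns the bucket, so clients still see one logical service. Purely as an illustration (endpoints and bucket names are invented), today that decision boils down to which endpoint a given bucket lives behind:

aws --endpoint-url https://s3-cluster-a.example.internal s3 ls s3://fresh-data-2023
aws --endpoint-url https://s3-cluster-b.example.internal s3 ls s3://archive-2019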
Do you have any similar experience? How did you handle it? Maybe you have some advice? I'm not a Ceph expert - I'm just a Ceph user and software developer who doesn't like to duplicate someone else's work.
Best,
Paweł
Hello,
we removed an SSD cache tier and its pool.
The PGs for the pool do still exist.
The cluster is healthy.
The PGs are empty and they reside on the cache tier pool's SSDs.
We would like to take out the disks, but that is not possible: the cluster
still sees the PGs and responds with a HEALTH_WARN.
Because of the 3x replication there are still 128 PGs on three of
the 24 OSDs. We were able to remove the other OSDs.
Summary:
- pool removed
- 3 x 128 empty PGs still exist
- 3 of 24 OSDs still exist
How is it possible to remove these empty and healthy PGs?
The only way I found was something like:
ceph pg {pg-id} mark_unfound_lost delete
Is that the right way?
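Before doing anything destructive I would of course double-check that pool 3 is really gone and that the PGs hold no data, e.g. with:

ceph osd pool ls detail
ceph pg 3.0 query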
Some output of:
ceph pg ls-by-osd 23
PG   OBJECTS  DEGRADED  MISPLACED  UNFOUND  BYTES  OMAP_BYTES*  OMAP_KEYS*  LOG  STATE         SINCE  VERSION  REPORTED         UP            ACTING        SCRUB_STAMP                      DEEP_SCRUB_STAMP
3.0  0        0         0          0        0      0            0           0    active+clean  27h    0'0      2627265:196316   [15,6,23]p15  [15,6,23]p15  2023-09-28T12:41:52.982955+0200  2023-09-27T06:48:23.265838+0200
3.1  0        0         0          0        0      0            0           0    active+clean  9h     0'0      2627266:19330    [6,23,15]p6   [6,23,15]p6   2023-09-29T06:30:57.630016+0200  2023-09-27T22:58:21.992451+0200
3.2  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:1135185  [23,15,6]p23  [23,15,6]p23  2023-09-29T13:42:07.346658+0200  2023-09-24T14:31:52.844427+0200
3.3  0        0         0          0        0      0            0           0    active+clean  13h    0'0      2627266:193170   [6,15,23]p6   [6,15,23]p6   2023-09-29T01:56:54.517337+0200  2023-09-27T17:47:24.961279+0200
3.4  0        0         0          0        0      0            0           0    active+clean  14h    0'0      2627265:2343551  [23,6,15]p23  [23,6,15]p23  2023-09-29T00:47:47.548860+0200  2023-09-25T09:39:51.259304+0200
3.5  0        0         0          0        0      0            0           0    active+clean  2h     0'0      2627265:194111   [15,6,23]p15  [15,6,23]p15  2023-09-29T13:28:48.879959+0200  2023-09-26T15:35:44.217302+0200
3.6  0        0         0          0        0      0            0           0    active+clean  6h     0'0      2627265:2345717  [23,15,6]p23  [23,15,6]p23  2023-09-29T09:26:02.534825+0200  2023-09-27T21:56:57.500126+0200
Best regards,
Malte
Hi all,
I'm affected by a stuck MDS warning for 2 clients: "failing to respond to cache pressure". This is a false alarm, as no MDS is under any cache pressure, and the warning has been stuck for a couple of days now. I found some old threads about cases where the MDS does not update the flags/triggers for this warning in certain situations; they date back to Luminous and I'm probably hitting one of these.
In these threads I could find a lot, except for instructions on how to clear this warning in a nice way. Is there something I can do on the clients to clear it? I don't want to evict/reboot just because of that.
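The only client-side workaround I can think of (for kernel clients) is to force the dentry/inode caches to be dropped so the client releases its caps, e.g.:

sync
echo 2 > /proc/sys/vm/drop_caches

but I don't know whether that actually makes the MDS clear the flag.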
Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
Use case:
* Ceph cluster with old nodes having 6TB HDDs
* Add new node with new 12TB HDDs
Is it supported/recommended to pack two 6TB HDDs, currently handled by 2 old
OSDs, into one 12TB LVM volume handled by 1 new OSD?
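To make the question concrete, what I have in mind is roughly the following on an old node, after the two old OSDs have been destroyed (device and VG/LV names are just examples):

pvcreate /dev/sdb /dev/sdc
vgcreate vg_osd12 /dev/sdb /dev/sdc
lvcreate -l 100%FREE -n lv_osd12 vg_osd12
ceph-volume lvm create --data vg_osd12/lv_osd12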
Regards,
Renaud Miel
Hello,
I have a Nautilus cluster built using Ceph packages from Debian 10
Backports, deployed with Ceph-Ansible.
I see that Debian does not offer Ceph 15/Octopus packages. However,
download.ceph.com does offer such packages.
Question: Is it safe to install the download.ceph.com packages
on top of the buster-backports packages?
If so, the next question is how to deploy this. Should I pull down an
appropriate version of Ceph-Ansible and use the rolling-upgrade playbook?
Or just apt-get -f dist-upgrade the new Ceph packages into place?
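For context, I assume the repository switch itself would look roughly like this on each node before any upgrade is attempted (following the Ceph docs for Debian; please correct me if this is the wrong approach):

wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
echo "deb https://download.ceph.com/debian-octopus/ buster main" | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get update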
BTW, in the long run I'll probably want to get to container-based Reef, but
I need to keep a stable cluster throughout.
Any advice or reassurance much appreciated.
Thanks.
-Dave
--
Dave Hall
Binghamton University
kdhall(a)binghamton.edu