Hi,
We had a case reported by our customer where a faulty disk was
returning an ENODATA error on directory split, and this created some
mess: when a transaction operation encountered the directory split
error, the operation was aborted, but the whole transaction was not,
and the remaining operations were still executed.
The kernel log:
2020-12-16T07:02:36.736166+09:00 node5 kernel: [10270806.635341] sd 0:2:10:0: [sdk] tag#1 BRCM Debug mfi stat 0x2d, data len requested/completed 0x4000/0x0
2020-12-16T07:02:36.736180+09:00 node5 kernel: [10270806.635349] sd 0:2:10:0: [sdk] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
2020-12-16T07:02:36.736181+09:00 node5 kernel: [10270806.635351] sd 0:2:10:0: [sdk] tag#1 Sense Key : Medium Error [current]
2020-12-16T07:02:36.736184+09:00 node5 kernel: [10270806.635353] sd 0:2:10:0: [sdk] tag#1 Add. Sense: Unrecovered read error
2020-12-16T07:02:36.736203+09:00 node5 kernel: [10270806.635355] sd 0:2:10:0: [sdk] tag#1 CDB: Read(16) 88 00 00 00 00 00 02 67 ec 00 00 00 00 20 00 00
2020-12-16T07:02:36.736234+09:00 node5 kernel: [10270806.635357] blk_update_request: critical medium error, dev sdk, sector 40365056
2020-12-16T07:02:36.736240+09:00 node5 kernel: [10270806.635379] XFS (sdk2): metadata I/O error: block 0x1edda00 ("xfs_trans_read_buf_map") error 61 numblks 32
2020-12-16T07:02:36.736241+09:00 node5 kernel: [10270806.635384] XFS (sdk2): xfs_imap_to_bp: xfs_trans_read_buf() returned error -61.
The osd log:
2020-12-16 07:02:36.419810 7f283a43d700 1 _created [2,5,C,A,6] has 447 objects, starting split in pg 5.452s0_head.
2020-12-16 07:02:36.736125 7f283a43d700 1 _created [2,5,C,A,6] split completed in pg 5.452s0_head.
2020-12-16 07:02:36.736150 7f283a43d700 -1 filestore(/var/lib/ceph/osd/ceph-57) error creating 0#5:4a3568ed:::<CENSORED>:head# (/var/lib/ceph/osd/ceph-57/current/5.452s0_head/DIR_2/DIR_5/DIR_C/DIR_A/DIR_6/<CENSORED>__head_B716AC52__5_ffffffffffffffff_0) in index: (61) No data available
So a transaction operation created a new object file, detected that
the directory needed splitting, tried to split, failed, aborted the
operation in the middle, and returned the ENODATA error to
FileStore::_do_transaction, where it was ignored and the transaction
continued.
We do not know where exactly the split was failing. It did not seem
to cause data loss, but aborting a transaction operation in the
middle could leave things in a messy state.
We were seeing at least two types of such "messy" transactions:
1) When rados writes a new object, one of the first transaction
operations is OP_TOUCH. It creates the object file, tries to split
the directory, aborts, and as a result skips creating the object's
spill_out attribute.
2) When rados deletes an object, one of the transaction operations is
OP_COLL_MOVE_RENAME, which creates a temporary link. This triggers
the directory split and the error, and the op is aborted in the
middle, leaving the original object file not removed.
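To make the pattern concrete, here is a minimal sketch of the op loop
(heavily simplified, with invented helper names; not the actual
FileStore code), showing where propagating the per-op error would
abort the whole transaction instead of letting it continue:

  #include <vector>

  enum OpType { OP_TOUCH, OP_COLL_MOVE_RENAME /* , ... */ };
  struct Op { OpType type; /* target object, collection, ... */ };

  // Stand-in for the real op handlers, which may trigger a directory
  // split internally and fail with a negative errno such as -ENODATA.
  int do_op(const Op& op) { (void)op; return 0; }

  int do_transaction(const std::vector<Op>& ops) {
    for (const auto& op : ops) {
      int r = do_op(op);
      if (r < 0)
        return r;  // abort the remaining ops on the first failure,
                   // instead of ignoring it and executing the next op
    }
    return 0;
  }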
So it looks like a bug that could be fixed, but the question is
whether upstream is still interested in improving FileStore in this
area. Should I report it to the tracker?
--
Mykola Golub
Hi, everyone.
Is teuthology-openstack still used for large-scale testing nowadays?
We are currently planning to use teuthology as the basis for building
large-scale Ceph testing infrastructure on top of our virtualization
platform, and we thought teuthology-openstack would be relatively
suitable for our purpose. However, it seems that some of its code
still targets Python 2 and hasn't been maintained for a while, which
brought us to this question.
Thanks:-)
Hi everyone,
Here are the highlights from this week's CLT meeting:
The Ceph Developer Summit Quincy videos [0] are now available! We'll
be scheduling component-specific meetings to prioritize features for
Quincy, based on the discussions we had at CDS. The Ceph backlog
Trello board [1] will be updated to reflect these. The meeting details will
be added to the community calendar [2] and sent to the mailing list.
Anyone interested in these discussions is welcome to join.
We have started planning a Ceph Virtual event, which will likely be
held in June 2021. The CFP and format of this event will be shared
with the community very soon.
The results of the Ceph User Survey 2021 are now available [3]. The
Ceph User Survey Working Group [4] is working on a summary of the
2021 insights, expected in the next month or so.
A point release is being planned for Pacific to address cephadm
upgrade issues from Octopus, the systemd issue reported in [5], and
some other bugs reported against 16.2.0.
We also discussed the idea of exploring a Redmine plugin such as [6]
to identify already-reported issues and avoid duplicate bug reports.
There were some concerns raised about overlapping community meetings
on the community calendar using the same BlueJeans room. Going
forward, we might use separate BlueJeans rooms to solve this. If
anybody finds the overlapping meetings useful, please chime in.
A quick reminder about these CLT meetings: anyone involved in the
project is welcome to join. The time, link, and detailed notes are in [7].
You can also find them in the Ceph community calendar [2].
Thanks,
Neha
[0] https://www.youtube.com/playlist?list=PLrBUGiINAakPEoaacUQwA6Aqv0uS4N7Yi
[1] https://trello.com/b/ugTc2QFH/ceph-backlog
[2] https://calendar.google.com/calendar/embed?src=9ts9c7lt7u1vic2ijvvqqlfpo0%4…
[3] https://drive.google.com/file/d/1YjK8Wha6C5lJjQlHIoSpSbt3_Y55XNPl/view
[4] https://tracker.ceph.com/projects/ceph/wiki/User_Survey_Working_Group
[5] https://tracker.ceph.com/issues/50347
[6] https://www.redmine.org/plugins/didyoumean
[7] https://pad.ceph.com/p/clt-weekly-minutes
Hi,
I have some questions about the feature tracker
https://tracker.ceph.com/issues/10679, which requests chattr +i
support in cephfs, and wanted to check whether the earlier discussed
approach is still good [1].
A while ago, John Spray suggested that the S_IMMUTABLE flag be stored
in the high bits of the cephfs inode's mode attribute, as the inode
doesn't have an i_flags attribute. The cephfs inode's mode attribute
has 32 bits, and it seems only 16 of them are used for access modes
and the file type. In comparison, the ext4 inode's mode attribute and
i_flags attribute are 16 bits and 32 bits respectively. Should we
store the S_IMMUTABLE flag in the mode attribute, or in a new
i_flags attribute?
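To make the "high bits of mode" option concrete, here is a
hypothetical illustration (CEPH_MODE_IMMUTABLE is an invented name
for this sketch, not an existing constant):

  #include <cstdint>

  // The low 16 bits of mode carry the file type (S_IFMT == 0170000)
  // and permission bits (07777), so a 32-bit mode field has 16 upper
  // bits free for flags such as an immutable bit.
  constexpr uint32_t CEPH_MODE_IMMUTABLE = UINT32_C(1) << 16;  // invented

  inline bool is_immutable(uint32_t mode) {
    return (mode & CEPH_MODE_IMMUTABLE) != 0;
  }

  inline uint32_t set_immutable_bit(uint32_t mode) {
    return mode | CEPH_MODE_IMMUTABLE;
  }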
e2fsprogs' chattr uses the FS_IOC_[GS]ETFLAGS ioctls to get/set the
S_IMMUTABLE flag of an inode. The two ioctls would need to be added
to the kernel and FUSE clients, as mentioned in [2]. Later, if
required, we could add the FS_IOC_FS[GS]ETXATTR ioctls [3], which can
also get/set inode flags. What interfaces should libcephfs use to
get/set the immutable bit flag?
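For reference, this is how a userspace tool like e2fsprogs' chattr
toggles the flag via those ioctls today on local filesystems; a
cephfs client implementing FS_IOC_GETFLAGS/FS_IOC_SETFLAGS would let
the same code work unchanged:

  #include <fcntl.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/fs.h>
  #include <cstdio>

  // Toggle FS_IMMUTABLE_FL via the standard flag ioctls (setting or
  // clearing the immutable bit requires CAP_LINUX_IMMUTABLE).
  int set_immutable(const char* path, bool on) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror("open"); return -1; }
    int flags = 0;
    if (ioctl(fd, FS_IOC_GETFLAGS, &flags) < 0) {  // read current flags
      perror("FS_IOC_GETFLAGS"); close(fd); return -1;
    }
    if (on) flags |= FS_IMMUTABLE_FL;
    else    flags &= ~FS_IMMUTABLE_FL;
    if (ioctl(fd, FS_IOC_SETFLAGS, &flags) < 0) {  // write them back
      perror("FS_IOC_SETFLAGS"); close(fd); return -1;
    }
    close(fd);
    return 0;
  }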
Sage suggested that the MDS check for the S_IMMUTABLE flag and not
issue write caps if it is set, and that the clients also check for
the flag and return a suitable errno to avoid waiting for the caps.
Does this enforcement sound good? Note that in Linux file systems,
the data of an immutable file can still be modified through fds
opened before the immutable flag was set [4].
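As a rough illustration of the suggested enforcement (all names here
are invented for the sketch, not actual client code):

  #include <cerrno>

  struct InodeState {        // simplified stand-in for a client inode
    bool immutable = false;  // mirrors the S_IMMUTABLE flag from the MDS
  };

  // Fail writes on an immutable inode up front with EPERM instead of
  // waiting for write caps that, under the suggested scheme, the MDS
  // would never grant.
  int check_write_allowed(const InodeState& in) {
    if (in.immutable)
      return -EPERM;         // avoid blocking on cap acquisition
    return 0;
  }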
Thanks,
Ramana
[1] https://www.spinics.net/lists/ceph-users/msg15221.html
[2] https://www.spinics.net/lists/ceph-users/msg15224.html
[3] https://lore.kernel.org/linux-fsdevel/1451886892-15548-1-git-send-email-dav…
[4] https://lwn.net/Articles/786258/
Hi ceph community,
I found that ceph-mgr crashes if 'mgr module path' is empty; see the log below (I added a log message to print the path, visible on line 2):
2021-04-14T03:36:08.428+0000 7f59be07b980 0 ceph version 0712f27 (c0712f27e4c61ab592f2279bfa9c283158a20876) quincy (dev), process ceph-mgr, pid 1103795
2021-04-14T03:36:08.456+0000 7f59be07b980 0 log_channel(cluster) log [INF] : path is
2021-04-14T03:36:08.464+0000 7f59be07b980 -1 *** Caught signal (Aborted) **
in thread 7f59be07b980 thread_name:ceph-mgr
ceph version 0712f27 (c0712f27e4c61ab592f2279bfa9c283158a20876) quincy (dev)
1: /home/zyh/ceph/build//bin/ceph-mgr() [0x133b2da]
2: /lib/x86_64-linux-gnu/libpthread.so.0(+0x153c0) [0x7f59beb223c0]
3: gsignal()
4: abort()
5: /lib/x86_64-linux-gnu/libstdc++.so.6(+0x9e951) [0x7f59be941951]
6: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa47c) [0x7f59be94d47c]
7: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa4e7) [0x7f59be94d4e7]
8: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xaa799) [0x7f59be94d799]
9: /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa228d) [0x7f59be94528d]
10: (std::filesystem::__cxx11::directory_iterator::directory_iterator(std::filesystem::__cxx11::path const&)+0x23) [0x10dd693]
11: (PyModuleRegistry::probe_modules(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const+0x17f) [0x10d917f]
12: (PyModuleRegistry::init()+0x1be) [0x10d885e]
13: (MgrStandby::init()+0xa5e) [0x109cbbe]
14: main()
15: __libc_start_main()
16: _start()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
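The backtrace suggests std::filesystem::directory_iterator throwing
for the empty path, with the uncaught exception ending in
std::terminate()/abort(). A minimal reproduction, plus a sketch of a
defensive fix using the std::error_code overload (not the actual
PyModuleRegistry code):

  #include <filesystem>
  #include <iostream>

  int main() {
    // directory_iterator("") throws filesystem_error; the error_code
    // overload reports the failure without throwing, so the caller
    // can degrade gracefully instead of aborting the whole process.
    std::error_code ec;
    std::filesystem::directory_iterator it{"", ec};  // empty 'mgr module path'
    if (ec) {
      std::cerr << "cannot scan module path: " << ec.message() << '\n';
      return 1;
    }
    return 0;
  }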
Thanks,
Yong-Hao Zou
zouyonghao1994(a)163.com
I'm happy to announce another release of the go-ceph API
bindings. This is a regular release following our every-two-months release
cadence.
https://github.com/ceph/go-ceph/releases/tag/v0.9.0
Changes in the release are detailed in the link above.
The bindings aim to play a similar role to the "pybind" python bindings in the
ceph tree but for the Go language. These API bindings require the use of cgo.
There are already a few consumers of this library in the wild, including the
ceph-csi project.
Specific questions, comments, bugs etc are best directed at our github issues
tracker or github discussions forum.
--
John Mulligan
phlogistonjohn(a)asynchrono.us
jmulligan(a)redhat.com