On Wed, Apr 28, 2021 at 5:05 AM <hase.jin(a)fujitsu.com> wrote:
Hi,
Let me share current idea about source code modification.
Regarding readdir(), errno is set to indicate the error, so we can check readdir() error
in Ceph codes.
[readdir() specification]
------------------------------------------------------------
https://man7.org/linux/man-pages/man3/readdir.3.html
DESCRIPTION
... It returns NULL on reaching the end of the directory stream or if an error occurred.
RETURN VALUE
... If the end of the directory stream is reached, NULL is returned
and errno is not changed. If an error occurs, NULL is returned
and errno is set to indicate the error. To distinguish end of
stream from an error, set errno to zero before calling readdir()
and then check the value of errno if NULL is returned.
------------------------------------------------------------
How about modify the codes where readdir () is being executed in filestore as follows:
------------------------------------------------------------
diff --git a/src/os/filestore/BtrfsFileStoreBackend.cc
b/src/os/filestore/BtrfsFileStoreBackend.cc
index df1d2452a1f..ad0c4d1bb1e 100644
--- a/src/os/filestore/BtrfsFileStoreBackend.cc
+++ b/src/os/filestore/BtrfsFileStoreBackend.cc
@@ -326,7 +326,13 @@ int BtrfsFileStoreBackend::list_checkpoints(list<string>&
ls)
list<string> snaps;
char path[PATH_MAX];
struct dirent *de;
- while ((de = ::readdir(dir))) {
+ while (true) {
+ errno = 0;
+ de = ::readdir(dir);
+ if (de == nullptr) {
+ ceph_assert(errno == 0);
+ break;
+ }
Yes! Checking the readdir result is something we should have been
doing; ignoring it is a bug. I would adjust this code to also print
the error code to the log, though.
sage
> snprintf(path, sizeof(path), "%s/%s", get_basedir_path().c_str(),
de->d_name);
>
> struct stat st;
> diff --git a/src/os/filestore/FileStore.cc b/src/os/filestore/FileStore.cc
> index f059331b090..1aeec855e9a 100644
> --- a/src/os/filestore/FileStore.cc
> +++ b/src/os/filestore/FileStore.cc
> @@ -4987,7 +4987,13 @@ int FileStore::list_collections(vector<coll_t>& ls,
bool include_temp)
> }
>
> struct dirent *de = nullptr;
> - while ((de = ::readdir(dir))) {
> + while (true) {
> + errno = 0;
> + de = ::readdir(dir);
> + if (de == nullptr) {
> + ceph_assert(errno == 0);
> + break;
> + }
> if (de->d_type == DT_UNKNOWN) {
> // d_type not supported (non-ext[234], btrfs), must stat
> struct stat sb;
> diff --git a/src/os/filestore/LFNIndex.cc b/src/os/filestore/LFNIndex.cc
> index bbf65bdc66a..530d7501ddc 100644
> --- a/src/os/filestore/LFNIndex.cc
> +++ b/src/os/filestore/LFNIndex.cc
> @@ -429,7 +429,13 @@ int LFNIndex::list_objects(const vector<string>
&to_list, int max_objs,
> int r = 0;
> int listed = 0;
> bool end = true;
> - while ((de = ::readdir(dir))) {
> + while (true) {
> + errno = 0;
> + de = ::readdir(dir);
> + if (de == nullptr) {
> + ceph_assert(errno == 0);
> + break;
> + }
> end = false;
> if (max_objs > 0 && listed >= max_objs) {
> break;
> @@ -477,7 +483,13 @@ int LFNIndex::list_subdirs(const vector<string>
&to_list,
> return -errno;
>
> struct dirent *de = nullptr;
> - while ((de = ::readdir(dir))) {
> + while (true) {
> + errno = 0;
> + de = ::readdir(dir);
> + if (de == nullptr) {
> + ceph_assert(errno == 0);
> + break;
> + }
> string short_name(de->d_name);
> string demangled_name;
> if (lfn_is_subdir(short_name, &demangled_name)) {
> ------------------------------------------------------------
>
> I haven't verified this modification yet, but is this idea (check readdir()
errors) acceptable?
> If Ceph can't detect the error, it's hard to deal with.
> But regarding readdir(), it can detect the error, so I would like to fix it.
>
> Jin
> _______________________________________________
> Dev mailing list -- dev(a)ceph.io
> To unsubscribe send an email to dev-leave(a)ceph.io