Hi Igor,
Thank you for the help.
On 3/16/20 7:47 AM, Igor Fedotov wrote:
OSD-709 has been already expanded, right?
Correct, I expanded it with 'ceph-bluestore-tool --log-level 30 --path
/var/lib/ceph/osd/ceph-709 --command bluefs-bdev-expand'. Does this
expand both bluefs and the data allocation?
Is there a way to query how much space is used by the non-bluefs portion?
[root@obj21 ceph-709]# pwd
/var/lib/ceph/osd/ceph-709
[root@obj21 ceph-709]# ls -la
total 28
drwxrwxrwt. 2 ceph ceph 180 Mar 15 22:21 .
drwxr-x---. 52 ceph ceph 4096 Jan 8 14:44 ..
lrwxrwxrwx. 1 ceph ceph 30 Mar 15 22:21 block ->
/dev/ceph-data-vol20/data-20-0
-rw-------. 1 ceph ceph 37 Mar 15 22:21 ceph_fsid
-rw-------. 1 ceph ceph 37 Mar 15 22:21 fsid
-rw-------. 1 ceph ceph 57 Mar 15 22:21 keyring
-rw-------. 1 ceph ceph 6 Mar 15 22:21 ready
-rw-------. 1 ceph ceph 10 Mar 15 22:21 type
-rw-------. 1 ceph ceph 4 Mar 15 22:21 whoami
[root@obj21 ceph-709]# lvdisplay /dev/ceph-data-vol20/data-20-0
--- Logical volume ---
LV Path /dev/ceph-data-vol20/data-20-0
LV Name data-20-0
VG Name ceph-data-vol20
LV UUID idjDxJ-CbzN-fb1n-Bhzx-86vP-mSAD-HvIq5n
LV Write Access read/write
LV Creation host, time
obj21.umiacs.umd.edu, 2018-10-10 08:15:20 -0400
LV Status available
# open 0
LV Size 200.00 GiB
Current LE 51200
Segments 4
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:33
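As a quick sanity check on the lvdisplay numbers above, the extent size and the LV size in bytes can be derived with plain shell arithmetic. This is only a sketch: the variable names are made up for illustration, and the byte count is what I would expect a tool such as 'blockdev --getsize64' to report for this LV, not something I have captured here.

```shell
# Values copied from the lvdisplay output above.
lv_size_gib=200
current_le=51200

# Extent size in MiB = LV size in MiB / number of logical extents.
extent_mib=$(( lv_size_gib * 1024 / current_le ))
echo "extent size: ${extent_mib} MiB"

# LV size in bytes, for comparison with byte counts reported by other tools.
lv_size_bytes=$(( lv_size_gib * 1024 * 1024 * 1024 ))
echo "LV size: ${lv_size_bytes} bytes"
```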
What's the error reported by fsck?
It doesn't say, but when I try to run the deep fsck it produces:
[root@obj21 ceph-709]# ceph-bluestore-tool --log-level 30 --path
/var/lib/ceph/osd/ceph-709 --command fsck
2020-03-16 08:02:16.590 7f5faaa11c00 -1
bluestore(/var/lib/ceph/osd/ceph-709) fsck error: bluefs_extents
inconsistency, downgrade to previous releases might be broken.
fsck found 1 error(s)
[0] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.709-fsck-deep
4) OSD.681 has a number of checksum verification errors when reading DB
data:
2020-03-15 14:03:52.890 7f6311ffa700 3 rocksdb:
[table/block_based_table_reader.cc:1117] Encountered error while reading
data from compression dictionary block Corruption: block checksum
mismatch: expected 0, got 2324967102 in db/012948.sst offset
18446744073709551615 size 18446744073709551615
Can't say if this is bound to space shortage or not. Wondering if other
OSDs reported(-ing) something similar?
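One detail in the rocksdb message above: the offset and size are both 18446744073709551615, which is the largest 64-bit unsigned value (2^64 - 1). A small sketch to confirm that, assuming a shell whose printf accepts 64-bit unsigned integers; the variable names are illustrative only. Whether this means the fields were simply unset (for example a -1 printed as unsigned) is an assumption on my part, not something the log states.

```shell
# Offset and size value copied from the rocksdb message above.
reported=18446744073709551615

# Print the value in hexadecimal; sixteen 'f' digits means every bit of a
# 64-bit unsigned integer is set, i.e. the value is 2^64 - 1.
reported_hex=$(printf '%x' "$reported")
echo "$reported_hex"
```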
Here is another node where, at around '2020-03-15 13:51', the OSDs start
what looks like peering a few pgs. Then 716 fails at '2020-03-15 14:40',
and 719, for example, fails one minute later at '2020-03-15 14:41'.
[1] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.716.log-20200316.gz
[2] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.717.log-20200316.gz
[3] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.718.log-20200316.gz
[4] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.719.log-20200316.gz
[5] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.720.log-20200316.gz
[6] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.721.log-20200316.gz
[7] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.722.log-20200316.gz
[8] -
ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.722.log-20200316.gz
Thanks,
derek
--
Derek T. Yarnell
Director of Computing Facilities
University of Maryland
Institute for Advanced Computer Studies