The situation is:
health: HEALTH_WARN
1 pools have many more objects per pg than average
$ ceph health detail
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
    pool cephfs_data objects per pg (315399) is more than 1227.23 times cluster average (257)
$ ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 7.8 TiB 7.4 TiB 326 GiB 343 GiB 4.30
TOTAL 7.8 TiB 7.4 TiB 326 GiB 343 GiB 4.30
POOLS:
    POOL                ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    cephfs_data          6     2.2 TiB       2.52M     2.2 TiB     26.44       3.0 TiB
    cephfs_metadata      7     9.7 MiB         379     9.7 MiB         0       3.0 TiB
The "stored" value of the "cephfs_data" pool is 2.2 TiB. This must be
wrong. When I execute "du -sh" from the MDS root "/" I get a usage of:
$ du -sh
31G .
"df -h" shows:
$ df -h
Filesystem Size Used Avail Use% Mounted on
ip1,ip2,ip3:/ 5.2T 2.2T 3.0T 43% /storage/cephfs
It says that "Used" is 2.2T but "du" shows 31G.
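As a cross-check, the recursive size tracked by the MDS can be read directly from the mountpoint via the CephFS virtual xattrs (assuming the client exposes them); it should roughly match what "du" walks:

$ getfattr --only-values -n ceph.dir.rbytes /storage/cephfs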
The pg_num of the "cephfs_data" pool is currently 8. The autoscaler
suggests setting this parameter to 512:
$ ceph osd pool autoscale-status
POOL               SIZE     TARGET SIZE     RATE     RAW CAPACITY     RATIO     TARGET RATIO     BIAS     PG_NUM     NEW PG_NUM     AUTOSCALE
cephfs_metadata    9994k                    2.0      7959G            0.0000                     1.0      8                         off
cephfs_data        2221G                    2.0      7959G            0.5582                     1.0      8          512            off
After setting pg_num to 512, the situation is:
$ ceph health detail
HEALTH_WARN 1 pools have many more objects per pg than average
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
    pool cephfs_data objects per pg (4928) is more than 100.571 times cluster average (49)
$ ceph df
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 7.8 TiB 7.4 TiB 329 GiB 346 GiB 4.34
TOTAL 7.8 TiB 7.4 TiB 329 GiB 346 GiB 4.34
POOLS:
    POOL                ID     STORED      OBJECTS     USED       %USED     MAX AVAIL
    cephfs_data          6     30 GiB        2.52M     61 GiB      0.99       3.0 TiB
    cephfs_metadata      7     9.8 MiB         379     20 MiB         0       3.0 TiB
The "stored" value changed from 2.2 TiB to 30 GiB! This should be the
correct usage/size.
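Note that the object count stayed at 2.52M. It can be cross-checked directly against the pool if needed ("--all" includes all RADOS namespaces; listing a pool of this size takes a while):

$ rados -p cephfs_data ls --all | wc -l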
When I execute "du -sh" from the MDS root "/" I again get a usage of:
$ du -sh
31G
and "df -h" again shows:
$ df -h
Filesystem Size Used Avail Use% Mounted on
ip1,ip2,ip3:/ 5.2T 2.2T 3.0T 43% /storage/cephfs
It says that "Used" is 2.2T but "du" shows 31G.
Can anybody explain to me what the problem is?
On 14.01.20 at 11:15, Florian Pritz wrote:
Hi,
When we tried putting some load on our test cephfs setup by restoring a
backup in Artifactory, we eventually ran out of space (around 95% used
in `df` = 3.5TB), which caused Artifactory to abort the restore and clean
up. However, while a simple `find` no longer shows the files, `df` still
claims that we have around 2.1TB of data on the cephfs. `df -i` also
shows 2.4M used inodes. When using `du -sh` on a top-level mountpoint, I
get 31G used, which is data that really is still there and is expected
to be there.
Consequently, we also get the following warning:
MANY_OBJECTS_PER_PG 1 pools have many more objects per pg than average
    pool cephfs_data objects per pg (38711) is more than 231.802 times cluster average (167)
We are running ceph 14.2.5.
We have snapshots enabled on cephfs, but there are currently no active
snapshots listed by `ceph daemon mds.$hostname dump snaps --server` (see
below). I can't say for sure if we created snapshots during the backup
restore.
{
"last_snap": 39,
"last_created": 38,
"last_destroyed": 39,
"pending_noop": [],
"snaps": [],
"need_to_purge": {},
"pending_update": [],
"pending_destroy": []
}
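For reference, snapshots would also be visible client-side in the virtual `.snap` directory, which can be checked on any mountpoint (assuming the default snapdirname; `/mnt/cephfs` stands in for the actual mount):

$ ls /mnt/cephfs/.snap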
We only have a single CephFS.
We use the pool_namespace xattr for our various directory trees on the
cephfs.
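For context, that namespace is assigned per directory via the layout xattr, along these lines (directory and namespace name here are only placeholders):

$ setfattr -n ceph.dir.layout.pool_namespace -v myns /mnt/cephfs/some-tree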
`ceph df` shows:
POOL            ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
cephfs_data      6     2.1 TiB       2.48M     2.1 TiB     24.97       3.1 TiB
`ceph daemon mds.$hostname perf dump | grep stray` shows:
"num_strays": 0,
"num_strays_delayed": 0,
"num_strays_enqueuing": 0,
"strays_created": 5097138,
"strays_enqueued": 5097138,
"strays_reintegrated": 0,
"strays_migrated": 0,
`rados -p cephfs_data df` shows:
POOL_NAME       USED        OBJECTS     CLONES     COPIES      MISSING_ON_PRIMARY     UNFOUND     DEGRADED     RD_OPS       RD          WR_OPS       WR         USED COMPR     UNDER COMPR
cephfs_data     2.1 TiB     2477540          0     4955080                      0           0            0     10699626     6.9 TiB     86911076     35 TiB     0 B            0 B
total_objects 29718
total_used 329 GiB
total_avail 7.5 TiB
total_space 7.8 TiB
When I combine the usage and the free space shown by `df`, we would
exceed our cluster size. Our test cluster currently has 7.8TB total
space with a replication size of 2 for all pools. With 2.1TB "used" on
the cephfs according to `df` plus 3.1TB shown as "free", I get 5.2TB
total size. That would mean more than 10TB of data when accounting for
replication. Clearly this can't fit on a cluster with only 7.8TB of
capacity.
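In numbers (a quick check, assuming `bc` is available):

$ echo "(2.1 + 3.1) * 2" | bc
10.4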
Do you have any ideas why we see so many objects and so much reported
usage? Is there any way to fix this without recreating the cephfs?
Florian