Update: the primary data pool (con-fs2-meta2) does store data:
POOL           ID  STORED   %USED  MAX AVAIL  OBJECTS
con-fs2-meta1  12  240 MiB  0.02     1.1 TiB     6437
con-fs2-meta2  13      0 B  0        373 TiB    72167
con-fs2-data   14  103 GiB  0.01     894 TiB    88923
There is also activity:
pool con-fs2-meta1 id 12
client io 0 B/s rd, 3.0 MiB/s wr, 5 op/s rd, 119 op/s wr
pool con-fs2-meta2 id 13
client io 0 B/s wr, 0 op/s rd, 682 op/s wr
pool con-fs2-data id 14
client io 19 MiB/s wr, 0 op/s rd, 163 op/s wr
It seems that the data stored for hard links is what persists; other objects might just be
transient and therefore never seen.
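For anyone who wants to double-check this, the objects can be inspected directly. A minimal
sketch, assuming the usual backtrace layout (zero-size objects in the default data pool
carrying a "parent" xattr); the pool name is from my cluster:

    OBJ=$(rados -p con-fs2-meta2 ls | head -n 1)   # pick any object from the default data pool
    rados -p con-fs2-meta2 stat "$OBJ"             # expect size 0
    rados -p con-fs2-meta2 listxattr "$OBJ"        # expect "parent", i.e. the backtrace xattr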
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Frank Schilder <frans@dtu.dk>
Sent: 31 January 2020 18:19:34
To: Gregory Farnum; CASS Philip
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: CephFS - objects in default data pool
Dear Gregory and Philip,
I'm also experimenting with a replicated primary data pool and an erasure-coded
secondary data pool. I make the same observations with regard to objects and activity as
Philip. However, it does seem to make a difference. If I run a very aggressive fio test such
as:
fio --ioengine=libaio --direct=1 --name=test --filename=test --bs=4k --size=100G
--runtime=5m --readwrite=randwrite --iodepth=4096
or an even higher iodepth, I observe "slow metadata IOs" on an fs with metadata on a
replicated ssd pool and just a primary EC data pool. On the other hand, I do not observe
"slow metadata IOs" on an fs with the three-pool layout. In both cases I observe
"slow ops", though.
This result would indicate that the replicated primary data pool in front of the EC
secondary data pool does indeed have an effect. Strangely though, I cannot see any
activity on this pool with pool stats and neither are there any objects.
Is there any way to check if anything is on this pool and how much storage it uses?
"ceph df" is not helping and neither is "rados ls", which is a bit of
an issue when it comes to sizing.
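The only other probes I can think of, sketched under the assumption that the standard tools
are available:

    rados df                            # per-pool object counts and raw usage
    ceph osd pool stats con-fs2-meta2   # recent client I/O against a single pool
    ceph osd df tree                    # the OMAP column shows omap usage per OSD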
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
________________________________________
From: Gregory Farnum <gfarnum@redhat.com>
Sent: 28 January 2020 18:13:29
To: CASS Philip
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: CephFS - objects in default data pool
On Tue, Jan 28, 2020 at 4:26 PM CASS Philip <p.cass@epcc.ed.ac.uk> wrote:
I have a query about
https://docs.ceph.com/docs/master/cephfs/createfs/:
“The data pool used to create the file system is the “default” data pool and the location
for storing all inode backtrace information, used for hard link management and disaster
recovery. For this reason, all inodes created in CephFS have at least one object in the
default data pool.”
This does not match my experience (nautilus servers, nautilus FUSE client or CentOS 7
kernel client). I have a cephfs with a replicated top-level pool and a directory set to
use erasure coding with setfattr, though I also did the same test using the subvolume
commands with the same result. "ceph df detail" shows no objects used in the
top-level pool, as shown in
https://gist.github.com/pcass-epcc/af24081cf014a66809e801f33bcb535b (also displayed
in-line below).
Hmm, I think this is tripping over the longstanding issue that omap data is not reflected
in the pool stats. (Although I would expect it to still show up as objects; perhaps the
"ceph df" view has a different reporting chain, or else I'm confused
somehow.)
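If you want to look at a backtrace directly, something along these lines should work. This
is a sketch, not a recipe: it assumes the first object of a file is named
<inode-hex>.00000000 and that ceph-dencoder can decode the inode_backtrace_t type; file and
pool names are taken from the gist below:

    OBJ=$(printf '%x.00000000' "$(stat -c %i /test-fs/ec/new-file)")  # object name from inode
    rados -p cephfs.fs1-replicated.data getxattr "$OBJ" parent > /tmp/parent
    ceph-dencoder type inode_backtrace_t import /tmp/parent decode dump_json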
But anyway...
It would be useful if indeed clients didn’t have to write to the top-level pool, since
that would mean we could give different clients permission only to pool-associated
subdirectories without giving everyone write access to a pool with data structures shared
between all users of the filesystem.
*Clients* don't need write permission to the default data pool unless you want them to
write files there. The backtraces are maintained by the MDS. :)
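A sketch of the cap setup this enables; "client.foo" and "/subdir" are placeholders:

    # rw access restricted to one subtree; backtraces in the default pool stay the MDS's job
    ceph fs authorize fs1 client.foo /subdir rw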
-Greg
[root@hdr-admon01 ec]# ceph df detail; ceph fs ls; ceph fs status
RAW STORAGE:
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 3.3 PiB 3.3 PiB 32 TiB 32 TiB 0.95
nvme 2.9 TiB 2.9 TiB 504 MiB 2.5 GiB 0.08
TOTAL 3.3 PiB 3.3 PiB 32 TiB 32 TiB 0.95
POOLS:
POOL                        ID  STORED   OBJECTS  USED     %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
cephfs.fs1.metadata          5  162 MiB       63  324 MiB   0.01    1.4 TiB  N/A            N/A             63  0 B         0 B
cephfs.fs1-replicated.data   6      0 B        0      0 B      0    1.0 PiB  N/A            N/A              0  0 B         0 B
cephfs.fs1-ec.data           7  8.0 GiB    2.05k   11 GiB      0    2.4 PiB  N/A            N/A          2.05k  0 B         0 B
name: fs1, metadata pool: cephfs.fs1.metadata, data pools: [cephfs.fs1-replicated.data
cephfs.fs1-ec.data ]
fs1 - 4 clients
===
+------+--------+------------+---------------+-------+-------+
| Rank | State | MDS | Activity | dns | inos |
+------+--------+------------+---------------+-------+-------+
| 0 | active | hdr-meta02 | Reqs: 0 /s | 29 | 16 |
+------+--------+------------+---------------+-------+-------+
+----------------------------+----------+-------+-------+
| Pool | type | used | avail |
+----------------------------+----------+-------+-------+
| cephfs.fs1.metadata | metadata | 324M | 1414G |
| cephfs.fs1-replicated.data | data | 0 | 1063T |
| cephfs.fs1-ec.data | data | 11.4G | 2505T |
+----------------------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
| hdr-meta01 |
+-------------+
MDS version: ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus
(stable)
[root@hdr-admon01 ec]# ll /test-fs/ec/
total 12582912
-rw-r--r--. 1 root root 4294967296 Jan 27 22:26 new-file
-rw-r--r--. 2 root root 4294967296 Jan 28 14:06 new-file2
-rw-r--r--. 2 root root 4294967296 Jan 28 14:06 new-file-same-inode-as-newfile2
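The link count of 2 on the last two entries shows they share an inode; "ls -i" confirms it,
and that inode (in hex) is what names the backtrace object in the default data pool:

    ls -i /test-fs/ec/new-file2 /test-fs/ec/new-file-same-inode-as-newfile2
    # both lines print the same inode number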
Regards,
Phil
_________________________________________
Philip Cass
HPC Systems Specialist – Senior Systems Administrator
EPCC
Advanced Computing Facility
Bush Estate
Penicuik
Tel: +44 (0)131 4457815
Email: p.cass@epcc.ed.ac.uk
_________________________________________
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io