Hi Robert,
I am not quite sure I understand your question correctly, but what I gather is that you want inbound writes to land on the cache tier, which would presumably be on faster media, possibly an SSD.

From there you would want the data to trickle down to the base tier, which is an EC pool hosted on HDDs.
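
If it helps, here is a minimal sketch of how a writeback cache tier is normally attached, assuming the EC base pool and the SSD cache pool already exist (the pool names here are placeholders, not yours):

    ceph osd tier add base_ec_pool cache_pool            # attach cache_pool as a tier of base_ec_pool
    ceph osd tier cache-mode cache_pool writeback        # inbound writes land on the cache tier first
    ceph osd tier set-overlay base_ec_pool cache_pool    # route client traffic through the cache tier
    ceph osd pool set cache_pool hit_set_type bloom      # hit sets are required for a writeback cache tier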

Some pointers I have:
It is better to have separate media for the base and cache tiers, HDD and SSD respectively.

If the intent is never to promote to the cache tier on read, you could set min_read_recency_for_promote to a high number such as 3 and, at the same time, make the bloom filter window small. (This basically translates into: promote only if the object has been read X times in the past Y seconds.)

Keep in mind that the larger the window, the larger the bloom filter, and hence the higher the OSD memory usage.
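
For illustration only, something along these lines gives you a single small hit set and a read recency that is effectively never satisfied (the pool name and numbers are placeholders, and the exact behaviour of a recency value larger than hit_set_count has varied between releases, so please test on your version first):

    ceph osd pool set cache_pool hit_set_count 1                    # keep a single bloom filter
    ceph osd pool set cache_pool hit_set_period 600                 # each hit set covers ~10 minutes
    ceph osd pool set cache_pool min_read_recency_for_promote 3     # reads should effectively never promote
    ceph osd pool set cache_pool min_write_recency_for_promote 0    # writes always go to the cache tier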

I have a patch lurking somewhere which disables promotes; let me check on it if this is for a specific case.

If your intent is to have a constant decay rate from the cache tier to the base tier, here is what you could do (an illustrative set of commands follows these steps):

1. Set the max objects on the cache tier to X.
2. Set the max size to Y; this would normally be 60-70 percent of the total cache tier capacity.
3. Flushes start happening as soon as the first of the above thresholds is hit.
4. You could set the evict age to roughly double the time you expect the data to take to reach the base tier.
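
For example (the pool name and the numbers are placeholders, not recommendations):

    ceph osd pool set cache_pool target_max_objects 1000000        # step 1: X objects
    ceph osd pool set cache_pool target_max_bytes 1099511627776    # step 2: Y bytes, ~60-70% of the cache tier (1 TiB here)
    ceph osd pool set cache_pool cache_target_dirty_ratio 0.4      # start flushing once 40% of the target is dirty
    ceph osd pool set cache_pool cache_target_full_ratio 0.8       # start evicting once 80% of the target is reached
    ceph osd pool set cache_pool cache_min_flush_age 600           # seconds before a dirty object may be flushed
    ceph osd pool set cache_pool cache_min_evict_age 1200          # step 4: roughly double the expected flush time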
5. Lastly, have you tried running cosbench or a similar tool to qualify the IOPS of your base tier with EC enabled? You may not require the cache tier at all (a quick rados bench sketch follows this list).
6. There is substantial overhead in maintaining a cache tier, the biggest being the absence of throttles on how flushing happens.
7. A thundering herd of write requests can cause a huge amount of flushing to the base tier.
8. IMHO it is suitable and predictable for workloads where the number of ingress requests can be predicted and there is some kind of rate limiting on them.
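
If you do not want to set up cosbench, a quick and rough baseline of the EC base pool can be taken with rados bench (the pool name is a placeholder; use a test pool, as the write phase creates benchmark objects):

    rados bench -p base_ec_pool 60 write -b 4M -t 16 --no-cleanup   # 60s of 4 MiB writes, 16 in flight
    rados bench -p base_ec_pool 60 seq -t 16                        # sequential reads of the objects written above
    rados bench -p base_ec_pool 60 rand -t 16                       # random reads
    rados -p base_ec_pool cleanup                                   # remove the benchmark objects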


Hope this helps

Thanks
Romit

On Tue, 3 Dec 2019, 04:11 , <ceph-users-request@ceph.io> wrote:
Send ceph-users mailing list submissions to
        ceph-users@ceph.io

To subscribe or unsubscribe via email, send a message with subject or
body 'help' to
        ceph-users-request@ceph.io

You can reach the person managing the list at
        ceph-users-owner@ceph.io

When replying, please edit your Subject line so it is more specific
than "Re: Contents of ceph-users digest..."

Today's Topics:

   1. Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)
      (Ilya Dryomov)
   2. Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)
      (Marc Roos)
   3. Re: ceph node crashed with these errors "kernel: ceph: build_snap_context" (maybe now it is urgent?)
      (Marc Roos)
   4. Re: Possible data corruption with 14.2.3 and 14.2.4
      (Simon Ironside)
   5. Re: ceph node crashed with these errors "kernel: ceph: build_snap_context"  (maybe now it is urgent?)
      (Marc Roos)
   6. Can min_read_recency_for_promote be -1 (Robert LeBlanc)


----------------------------------------------------------------------

Date: Mon, 2 Dec 2019 14:59:05 +0100
From: Ilya Dryomov <idryomov@gmail.com>
Subject: [ceph-users] Re: ceph node crashed with these errors "kernel:
        ceph: build_snap_context" (maybe now it is urgent?)
To: Marc Roos <M.Roos@f1-outsourcing.eu>
Cc: ceph-users <ceph-users@ceph.io>, jlayton <jlayton@kernel.org>
Message-ID:
        <CAOi1vP-uyxeaKvuxUQbe2nsuXH9-f6_QxcggOCv6LrCBzugJOw@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

On Mon, Dec 2, 2019 at 1:23 PM Marc Roos <M.Roos@f1-outsourcing.eu> wrote:
>
>
>
> I guess this is related? kworker 100%
>
>
> [Mon Dec  2 13:05:27 2019] SysRq : Show backtrace of all active CPUs
> [Mon Dec  2 13:05:27 2019] sending NMI to all CPUs:
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 0 skipped: idling at pc
> 0xffffffffb0581e94
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 1 skipped: idling at pc
> 0xffffffffb0581e94
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 2 skipped: idling at pc
> 0xffffffffb0581e94
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 3 skipped: idling at pc
> 0xffffffffb0581e94
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 4
> [Mon Dec  2 13:05:27 2019] CPU: 4 PID: 426200 Comm: kworker/4:2 Not
> tainted 3.10.0-1062.4.3.el7.x86_64 #1
> [Mon Dec  2 13:05:27 2019] Hardware name: Supermicro
> X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.0b 05/27/2014
> [Mon Dec  2 13:05:27 2019] Workqueue: ceph-msgr ceph_con_workfn
> [libceph]
> [Mon Dec  2 13:05:27 2019] task: ffffa0c8e1240000 ti: ffffa0ccb6364000
> task.ti: ffffa0ccb6364000
> [Mon Dec  2 13:05:27 2019] RIP: 0010:[<ffffffffc08d7db9>]
> [<ffffffffc08d7db9>] cmpu64_rev+0x19/0x20 [ceph]
> [Mon Dec  2 13:05:27 2019] RSP: 0018:ffffa0ccb6367a20  EFLAGS: 00000202
> [Mon Dec  2 13:05:27 2019] RAX: 0000000000000001 RBX: 0000000000000038
> RCX: 0000000000000008
> [Mon Dec  2 13:05:27 2019] RDX: 0000000000025c33 RSI: ffffa0cbbe380050
> RDI: ffffa0cbbe380030
> [Mon Dec  2 13:05:27 2019] RBP: ffffa0ccb6367a20 R08: 0000000000000018
> R09: 00000000000013ed
> [Mon Dec  2 13:05:27 2019] R10: 0000000000000002 R11: ffffe94994f8e000
> R12: ffffa0cbbe380030
> [Mon Dec  2 13:05:27 2019] R13: ffffffffc08d7da0 R14: ffffa0cbbe380018
> R15: ffffa0cbbe380050
> [Mon Dec  2 13:05:27 2019] FS:  0000000000000000(0000)
> GS:ffffa0d2cfb00000(0000) knlGS:0000000000000000
> [Mon Dec  2 13:05:27 2019] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> [Mon Dec  2 13:05:27 2019] CR2: 000055a7c413fcb9 CR3: 0000001813010000
> CR4: 00000000000607e0
> [Mon Dec  2 13:05:27 2019] Call Trace:
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb019303f>] sort+0x1af/0x260
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb0192e60>] ? u32_swap+0x10/0x10
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08d807b>]
> build_snap_context+0x12b/0x290 [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08d820c>]
> rebuild_snap_realms+0x2c/0x90 [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08d822b>]
> rebuild_snap_realms+0x4b/0x90 [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08d91fc>]
> ceph_update_snap_trace+0x3ec/0x530 [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08e2239>]
> handle_reply+0x359/0xc60 [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc08e48ba>] dispatch+0x11a/0xb00
> [ceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb042e56a>] ?
> kernel_recvmsg+0x3a/0x50
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc05fcff4>] try_read+0x544/0x1300
> [libceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafee13ce>] ?
> account_entity_dequeue+0xae/0xd0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafee4d5c>] ?
> dequeue_entity+0x11c/0x5e0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb042e417>] ?
> kernel_sendmsg+0x37/0x50
> [Mon Dec  2 13:05:27 2019]  [<ffffffffc05fdfb4>]
> ceph_con_workfn+0xe4/0x1530 [libceph]
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb057f568>] ?
> __schedule+0x448/0x9c0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafebe21f>]
> process_one_work+0x17f/0x440
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafebf336>]
> worker_thread+0x126/0x3c0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafebf210>] ?
> manage_workers.isra.26+0x2a0/0x2a0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafec61f1>] kthread+0xd1/0xe0
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafec6120>] ?
> insert_kthread_work+0x40/0x40
> [Mon Dec  2 13:05:27 2019]  [<ffffffffb058cd37>]
> ret_from_fork_nospec_begin+0x21/0x21
> [Mon Dec  2 13:05:27 2019]  [<ffffffffafec6120>] ?
> insert_kthread_work+0x40/0x40
> [Mon Dec  2 13:05:27 2019] Code: 87 c8 fc ff ff 5d 0f 94 c0 0f b6 c0 c3
> 0f 1f 44 00 00 66 66 66 66 90 48 8b 16 48 39 17 b8 01 00 00 00 55 48 89
> e5 72 08 0f 97 c0 <0f> b6 c0 f7 d8 5d c3 66 66 66 66 90 55 f6 05 ed 92
> 02 00 04 48
> [Mon Dec  2 13:05:27 2019] NMI backtrace for cpu 5

Yes, seems related.  I'm not sure how it relates to an upgrade to
nautilus, but as I mentioned in a different message, with thousands of
snapshots you are in a dangerous territory anyway.

Thanks,

                Ilya

------------------------------

Date: Mon, 2 Dec 2019 15:06:54 +0100
From: "Marc Roos" <M.Roos@f1-outsourcing.eu>
Subject: [ceph-users] Re: ceph node crashed with these errors "kernel:
        ceph: build_snap_context" (maybe now it is urgent?)
To: idryomov <idryomov@gmail.com>
Cc: ceph-users <ceph-users@ceph.io>, jlayton <jlayton@kernel.org>
Message-ID: <"H000007100158998.1575295614.sx.f1-outsourcing.eu*"@MHS>
Content-Type: text/plain;       charset="UTF-8"

 >
 >>  >
 >>  >ISTR there were some anti-spam measures put in place.  Is your
 >>  >account waiting for manual approval?  If so, David should be able to help.
 >>
 >> Yes if I remember correctly I get waiting approval when I try to log
 >> in.
 >>
 >>  >>
 >>  >>
 >>  >>
 >>  >> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9287
 >>  >> ffff911a9a26bd00 fail -12
 >>  >> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9283
 >>  >
 >>  >
 >>  >It is failing to allocate memory.  "low load" isn't very specific,
 >>  >can you describe the setup and the workload in more detail?
 >>
 >> 4 nodes (osd, mon combined), the 4th node has local cephfs mount,
 >> which is rsync'ing some files from vm's. 'low load' I have sort of test
 >> setup, going to production. Mostly the nodes are below a load of 1 (except
 >> when the concurrent rsync starts)
 >>
 >>  >How many snapshots do you have?
 >>
 >> Don't know how to count them. I have script running on a 2000 dirs.
 >> If one of these dirs is not empty it creates a snapshot. So in theory I
 >> could have 2000 x 7 days = 14000 snapshots.
 >> (btw the cephfs snapshots are in a different tree than the rsync is
 >> using)
 >
 >Is there a reason you are snapshotting each directory individually
 >instead of just snapshotting a common parent?

Yes because I am not sure the snapshot frequency on all folders is going
to be the same.

 >If you have thousands of snapshots, you may eventually hit a different
 >bug:
 >
 >https://tracker.ceph.com/issues/21420
 >https://docs.ceph.com/docs/master/cephfs/experimental-features/#snapshots
 >
 >Be aware that each set of 512 snapshots amplify your writes by 4K in
 >terms of network consumption.  With 14000 snapshots, a 4K write would
 >need to transfer ~109K worth of snapshot metadata to carry itself out.
 >

Also when I am not even writing to a tree with snapshots enabled? I am
rsyncing to dir3

.
├── dir1
│   ├── dira
│   │   └── .snap
│   ├── dirb
│   ├── dirc
│   │   └── .snap
│   └── dird
│       └── .snap
├── dir2
└── dir3

------------------------------

Date: Mon, 2 Dec 2019 16:29:07 +0100
From: "Marc Roos" <M.Roos@f1-outsourcing.eu>
Subject: [ceph-users] Re: ceph node crashed with these errors "kernel:
        ceph: build_snap_context" (maybe now it is urgent?)
To: idryomov <idryomov@gmail.com>
Cc: ceph-users <ceph-users@ceph.io>, jlayton <jlayton@kernel.org>
Message-ID: <"H000007100158aca.1575300547.sx.f1-outsourcing.eu*"@MHS>
Content-Type: text/plain;       charset="UTF-8"


I can confirm that removing all the snapshots seems to resolve the
problem.

A - I would propose a redesign such that only snapshots below the mountpoint are taken into account, not snapshots in the entire filesystem. That should fix a lot of issues.

B - That reminds me of the mv command, which does not move data across different pools in the fs. I would like to see this, because it is the logical thing to expect.






------------------------------

Date: Mon, 2 Dec 2019 15:54:54 +0000
From: Simon Ironside <sironside@caffetine.org>
Subject: [ceph-users] Re: Possible data corruption with 14.2.3 and
        14.2.4
To: ceph-users@ceph.io
Message-ID: <21d057e9-0088-4847-6d40-19cf2c848395@caffetine.org>
Content-Type: text/plain; charset=utf-8; format=flowed

Any word on 14.2.5? Nervously waiting here . . .

Thanks,
Simon.

On 18/11/2019 11:29, Simon Ironside wrote:

> I will sit tight and wait for 14.2.5.
>
> Thanks again,
> Simon.

------------------------------

Date: Mon, 2 Dec 2019 19:32:03 +0100
From: "Marc Roos" <M.Roos@f1-outsourcing.eu>
Subject: [ceph-users] Re: ceph node crashed with these errors "kernel:
        ceph: build_snap_context"  (maybe now it is urgent?)
To: ceph-users <ceph-users@ceph.io>, lhenriques <lhenriques@suse.com>
Message-ID: <"H000007100158b41.1575311519.sx.f1-outsourcing.eu*"@MHS>
Content-Type: text/plain;       charset="ISO-8859-1"


Yes Luis, good guess!! ;)



-----Original Message-----
Cc: ceph-users
Subject: Re: [ceph-users] ceph node crashed with these errors "kernel:
ceph: build_snap_context" (maybe now it is urgent?)

On Mon, Dec 02, 2019 at 10:27:21AM +0100, Marc Roos wrote:
>
> I have been asking before[1]. Since Nautilus upgrade I am having
> these, with a total node failure as a result(?). Was not expecting
> this in my 'low load' setup. Maybe now someone can help resolving
> this? I am also waiting quite some time to get access at
> https://tracker.ceph.com/issues.

Just a wild guess: do you have a lot of snapshots (> ~400)?  If so,
that's probably the problem.  See [1] and [2].

[1]
https://docs.ceph.com/docs/master/cephfs/experimental-features/#snapshots
[2] https://tracker.ceph.com/issues/21420

Cheers,
--
Luís

>
>
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9287 ffff911a9a26bd00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9283 ffff911d34e69d00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9276 ffff911d34e69c00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c926c ffff912068b92c00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9268 ffff912068b93000 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c926d ffff912068b92900 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c928a ffff912118e5be00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9272 ffff9119950d9500 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9269 ffff911940f3d000 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9270 ffff911748427c00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c926b ffff91169b000600 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9281 ffff91169b000500 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9288 ffff9115844d2500 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c927d ffff9115844d2e00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9280 ffff91186401b000 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9267 ffff9121535ecc00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c927c ffff9121cecb1e00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9271 ffff9121cecb0400 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9279 ffff911d26646300 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c927f ffff911d26646900 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9275 ffff9121cecb1700 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9259 ffff91170c9f6600 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9257 ffff9118ef2a8000 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c924e ffff911a1e091800 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9262 ffff911a1e090c00 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9266 ffff9115e3859500 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c924f ffff9118aefd1300 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c925f ffff91170c9f6100 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9252 ffff9115e3859800 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9256 ffff912045dc5300 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9254 ffff91170c9f6900 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020c9261 ffff91170c9f7100 fail -12
> Dec 1 03:14:36 c04 kernel: ceph: build_snap_context 100020d4ec4 ffff9118aefd0000 fail -12
>
> [1]
> https://www.mail-archive.com/ceph-users@ceph.io/msg01088.html
> https://www.mail-archive.com/ceph-users@ceph.io/msg00969.html
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-leave@ceph.io


------------------------------

Date: Mon, 2 Dec 2019 14:39:26 -0800
From: Robert LeBlanc <robert@leblancnet.us>
Subject: [ceph-users] Can min_read_recency_for_promote be -1
To: ceph-users <ceph-users@ceph.io>
Message-ID:
        <CAANLjFoecdW7oBh78L3dNO83C-DpDmqXw-kKtT+ShNKXjsqKJg@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"

I'd like to configure a cache tier to act as a write buffer, so that if
writes come in, it promotes objects, but reads never promote an object. We
have a lot of cold data so we would like to tier down to an EC pool
(CephFS) after a period of about 30 days to save space. The storage tier
and the 'cache' tier would be on the same spindles, so the only performance
improvement would be from the faster writes with replication. So we don't
want to really move data between tiers.

The idea would be to not promote on read since EC read performance is good
enough and have writes go to the cache tier where the data may be 'hot' for
a week or so, then get cold.

It seems that we would only need one hit_set and if -1 can't be set for
min_read_recency_for_promote, I could probably use 2 which would never hit
because there is only one set, but that may error too. The follow-up question is how
big a set should be, as it only really tells whether an object "may" be in cache
and does not determine when things are flushed, so it really only matters
how out-of-date we are okay with the bloom filter being, right?
So we could have it be a day long if we are okay with that stale rate? Is
there any advantage to having a longer period for a bloom filter? Now, I'm
starting to wonder if I even need a bloom filter for this use case, can I
get tiering to work without it and only use
cache_min_flush_age/cache_min_evict_age since I don't care about promoting
when there are X hits in Y time?

Thanks
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


------------------------------

Subject: Digest Footer

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io


------------------------------

End of ceph-users Digest, Vol 83, Issue 5
*****************************************
