Hi,
I have the following Ceph Mimic setup:
- a bunch of old servers with 3-4 SATA drives each (74 OSDs in total)
- index/leveldb is stored on each OSD (so no SSD drives, just SATA)
- the current usage is:

GLOBAL:
    SIZE        AVAIL       RAW USED     %RAW USED
    542 TiB     105 TiB     437 TiB      80.67
POOLS:
    NAME                          ID     USED        %USED     MAX AVAIL     OBJECTS
    .rgw.root                     1      1.1 KiB     0         26 TiB        4
    default.rgw.control           2      0 B         0         26 TiB        8
    default.rgw.meta              3      20 MiB      0         26 TiB        75357
    default.rgw.log               4      0 B         0         26 TiB        4271
    default.rgw.buckets.data      5      290 TiB     85.05     51 TiB        78067284
    default.rgw.buckets.non-ec    6      0 B         0         26 TiB        0
    default.rgw.buckets.index     7      0 B         0         26 TiB        603008
- rgw_override_bucket_index_max_shards = 16 (see the config sketch after this list). Clients are accessing RGW via Swift, not S3.
- the data protection scheme is EC 4+2.
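For reference, the override is set in ceph.conf on the RGW hosts, along these lines (just a sketch; the section name below is an example, not my exact config):

    [client.rgw.gw1]
    # only affects newly created buckets; existing buckets keep their shard count
    rgw_override_bucket_index_max_shards = 16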
We are using this Ceph cluster as secondary storage for another, more expensive storage infrastructure, offloading cold data onto it (big files with a low number of downloads/reads from our customers) to lower the TCO. So most of the files are big (a few GB at least).
So far Ceph is doing well, considering that I don't have big expectations of the current hardware. I'm a bit worried, however, that we already have 78M objects with max_shards=16 and will probably reach 100M in the next few months. Do I need to increase the max shards to ensure the stability of the cluster? I have read that storing more than 1M objects in a single bucket can lead to OSDs flapping or hitting I/O timeouts during deep-scrub, or even to OSD failures due to leveldb compacting all the time when there is a large number of DELETEs.
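From what I've read, increasing the shard count of an existing bucket would mean a manual reshard, roughly along these lines (the bucket name is just an example, and as far as I understand writes to that bucket are blocked while the reshard runs):

    # check the current object and shard count of one bucket
    radosgw-admin bucket stats --bucket=customer-container-01

    # reshard that bucket to 32 index shards
    radosgw-admin bucket reshard --bucket=customer-container-01 --num-shards=32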
Any advice would be appreciated.
Thank you,
Adrian Nicolae
Can you store your data in different buckets?
linyunfan
I'm using only Swift, not S3. We have a container for every customer.
Right now there are thousands of containers.
I think the recommendation is roughly 100K objects per shard per bucket. If you have many objects but they are spread across many buckets/containers, and each bucket/container holds fewer than 1.6M objects (with max_shards=16), then you should be OK.
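You can check this per bucket; something like the commands below should show how full the index shards are (the exact output fields may differ a bit between releases):

    # 16 shards * ~100K objects/shard ~= 1.6M objects per bucket before resharding is advisable
    radosgw-admin bucket limit check                 # objects_per_shard and fill_status per bucket
    radosgw-admin bucket stats --bucket=<bucket>     # per-bucket object counts in the "usage" section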
linyunfan
I understand now.
Thank you very much for your input.