Hi,
Is there a way to reinitialize the stored data and make it sync from the logs?
Thank you
Hi Federico,
On 04/02/2021 05:51, Federico Lucifredi wrote:
> Hi Loïc,
> I am intrigued, but am missing something: why not use RGW and store the source code files as objects? RGW has native compression and can take care of that behind the scenes.
Excellent question!
>
> Is the desire to use RBD only due to minimum allocation sizes?
I *assume* that, since RGW does not have specific strategies to take advantage of the fact that the objects are immutable and will never be removed:
* It will be slower to add artifacts to RGW than to an RBD image + index
* The metadata in RGW will be larger than with an RBD image + index
However, I have not verified this, and if you have an opinion I'd love to hear it :-)
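To make the RBD + index idea more concrete, here is the kind of write path I have in mind: only a rough, untested sketch, with made-up pool/image names and index files:

    # one large, thin-provisioned image (names are illustrative)
    rbd create --size 200T archive/packed0
    DEV=$(rbd device map archive/packed0)

    # append one artifact at the next free byte offset, kept in an external index
    OFFSET=$(cat /srv/index/next_offset)
    SIZE=$(stat -c%s artifact.bin)
    ID=$(sha1sum artifact.bin | cut -d' ' -f1)
    dd if=artifact.bin of="$DEV" bs=4M seek="$OFFSET" oflag=seek_bytes conv=notrunc,fdatasync
    echo "$ID $OFFSET $SIZE" >> /srv/index/index.tsv
    echo $((OFFSET + SIZE)) > /srv/index/next_offset

In production this would be librbd calls rather than dd through a mapped device, but the principle is the same: the image is only ever appended to, and only the index knows where each artifact lives.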
Cheers
>
> Best -F
>
>
> -- "'Problem' is a bleak word for challenge" - Richard Fish
> _________________________________________
> Federico Lucifredi
> Product Management Director, Ceph Storage Platform
> Red Hat
> A273 4F57 58C0 7FE8 838D 4F87 AEEB EC18 4A73 88AC
> redhat.com TRIED. TESTED. TRUSTED.
>
>
> On Sat, Jan 30, 2021 at 10:01 AM Loïc Dachary <loic(a)dachary.org> wrote:
>
> Bonjour,
>
> In the context of Software Heritage (a noble mission to preserve all source code)[0], artifacts have an average size of ~3KB and there are billions of them. They never change and are never deleted. To save space it would make sense to write them, one after the other, in an ever-growing RBD volume (more than 100TB). An index, located somewhere else, would record the offset and size of each artifact in the volume.
>
> I wonder if someone has already implemented this idea with success? And if not... does anyone see a reason why it would be a bad idea?
>
> Cheers
>
> [0] https://docs.softwareheritage.org/
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>
--
Loïc Dachary, Artisan Logiciel Libre
Hi.
We are trying to set up an NFS server using Ceph, which needs to be accessed by an IBM System i.
As far as I can tell, the IBM System i only supports NFS v4.0.
Looking at the nfs-ganesha deployments, it seems that these only support 4.1 or 4.2. I have tried editing the configuration file to also allow 4.0, and it seems to work.
Is there a reason that it currently only supports 4.1 and 4.2?
I can of course edit the configuration file, but I would have to do that after any deployment or upgrade of the nfs servers.
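For reference, the edit I make is roughly the following excerpt of ganesha.conf (block and option names are as I understood them from the NFS-Ganesha documentation, so please correct me if I got them wrong):

    NFSv4 {
        # the deployed config seems to list only 1, 2 by default; adding 0 is my change
        Minor_Versions = 0, 1, 2;
    }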
Regards
Jens Hyllegaard
Hi Federico,
here I am not mixing RAID1 with Ceph. I am doing a comparison: is it safer
to have a server with RAID1 disks, or two servers with Ceph and size=2
min_size=1?
We are talking about real-world examples where a customer is buying a new
server and wants to choose.
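For reference, these are the settings being discussed, on a replicated pool (the pool name is just a placeholder):

    ceph osd pool get mypool size
    ceph osd pool get mypool min_size
    # the commonly recommended values:
    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 2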
On Thu, Feb 4, 2021 at 05:52 Federico Lucifredi <federico(a)redhat.com> wrote:
> Ciao Mario,
>
>
>> It is obvious and a bit paranoid, because many servers at many customers
>> run on RAID1, and so you are saying: yeah, you have two copies of the data
>> but you can break both. Consider that in Ceph recovery is automatic, while
>> with RAID1 someone must manually go to the customer and change disks. So
>> Ceph is already an improvement in this case even with size=2. With size 3
>> and min 2 it is a bigger improvement, I know.
>>
>
> Generally speaking, users running Ceph at any scale do not use RAID to
> mirror their drives. They rely on data resiliency as delivered by Ceph
> (three replicas on HDD, two replicas on solid state media).
>
> It is expensive to run RAID underneath Ceph, and in some cases even
> counter-productive. We do use RAID controllers whenever we can because they
> are battery-backed and ensure writes hit the local disk even on a power
> failure, but that is (ideally) the only case where you hear the words RAID
> and Ceph together.
>
> -- "'Problem' is a bleak word for challenge" - Richard Fish
> _________________________________________
> Federico Lucifredi
> Product Management Director, Ceph Storage Platform
> Red Hat
> A273 4F57 58C0 7FE8 838D 4F87 AEEB EC18 4A73 88AC
> redhat.com TRIED. TESTED. TRUSTED.
>
Hi,
There are multiple different procedures to replace an OSD.
What I want is to replace an OSD without PG remapping.
#1
I tried "orch osd rm --replace", which sets OSD reweight 0 and
status "destroyed". "orch osd rm status" shows "draining".
All PGs on this OSD are remapped. Checked "pg dump", can't find
this OSD any more.
1) Given [1], setting weight 0 seems better than setting reweight 0.
Is that right? If yes, should we change the behavior of "orch osd
rm --replace"?
2) "ceph status" doesn't show anything about OSD draining.
Is there any way to see the progress of draining?
Is there actually any copying happening? The PGs on this OSD are remapped
and copied to other OSDs, right?
3) When the OSD is replaced, there will be remapping and backfilling.
4) So there is remapping during draining in 2) and remapping again in 3).
I want to avoid that.
#2
Is there any procedure that neither marks the OSD out (reweight 0)
nor sets its weight to 0, so the PG map stays unchanged and Ceph only
warns about reduced redundancy (one of the PG's 3 OSDs is down), and
when the OSD is replaced there is no remapping, just data backfilling?
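In other words, something like the following sketch is what I am hoping for (OSD id 12 and the device path are placeholders, commands taken from the docs and not verified end to end):

    ceph osd add-noout osd.12                    # don't mark it out, so no remapping
    systemctl stop ceph-osd@12                   # or stop the cephadm-managed daemon
    ceph osd destroy 12 --yes-i-really-mean-it   # keeps the OSD id and CRUSH weight
    # ... physically replace the disk ...
    ceph-volume lvm create --osd-id 12 --data /dev/sdX
    ceph osd rm-noout osd.12                     # only backfilling, no remapping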
[1] https://ceph.com/geen-categorie/difference-between-ceph-osd-reweight-and-ce…
Thanks!
Tony
Bonjour,
In the context of Software Heritage (a noble mission to preserve all source code)[0], artifacts have an average size of ~3KB and there are billions of them. They never change and are never deleted. To save space it would make sense to write them, one after the other, in an ever-growing RBD volume (more than 100TB). An index, located somewhere else, would record the offset and size of each artifact in the volume.
I wonder if someone has already implemented this idea with success? And if not... does anyone see a reason why it would be a bad idea?
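For illustration, reading an artifact back would just be a matter of looking up its (offset, size) in the index and reading those bytes from the image, along these lines (an untested sketch with made-up names):

    DEV=$(rbd device map archive/packed0)
    dd if="$DEV" of=artifact.bin bs=4M iflag=skip_bytes,count_bytes skip="$OFFSET" count="$SIZE"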
Cheers
[0] https://docs.softwareheritage.org/
--
Loïc Dachary, Artisan Logiciel Libre
Hi,
We're looking forward to a few major bugfixes for breaking mgr issues
with larger clusters that have been merged into the Octopus branch, as
well as the updated cheroot package pushed to EPEL that should make it
into the next container build. It's been quite some time since the last
(15.2.8) release, and we were curious whether there is an ETA for the
15.2.9 release being cut, or at least a place to look to determine its
status. We checked the docs and didn't see a way to gauge estimated
release dates, so we thought we'd ask here.
Thanks!
Hello,
As we know, with 64k for bluestore_min_alloc_size_hdd (I'm only using
HDDs), in certain conditions, especially with erasure coding, there is
a loss of space when writing objects smaller than 64k x k (EC: k+m).
Every object is divided into k elements, written on different OSDs.
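To make the numbers concrete (my own back-of-the-envelope example, assuming EC 4+2): a 128 KiB object is split into four 32 KiB data chunks plus two 32 KiB parity chunks, each chunk is rounded up to the 64 KiB allocation unit, so 6 x 64 KiB = 384 KiB is consumed on disk instead of the ~192 KiB a 4+2 profile should need, roughly 2x amplification.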
My main use case is big (40TB) RBD images mounted as XFS filesystems on
Linux servers,
exposed to our backup software.
So, it's mainly big files.
My thought, but I'd like some other points of view, is that I could deal
with the amplification by using bigger block sizes on my XFS filesystems,
instead of reducing bluestore_min_alloc_size_hdd on all OSDs.
What do you think?
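(For completeness, reducing it would be a one-liner like the line below, but as far as I understand bluestore_min_alloc_size_hdd only takes effect when an OSD is created, so it would mean redeploying every OSD, hence my reluctance:)

    ceph config set osd bluestore_min_alloc_size_hdd 4096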
Hello, thank you for your response.
Erasure coding keeps getting better, and we really cannot afford the
storage overhead of 3x replication.
Anyway, as I understand the problem, it is also present with
replication, just less amplified (blocks are not divided between OSDs,
just replicated in full).
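As a rough illustration of what I mean (my own numbers, with a 64 KiB allocation unit): a 48 KiB object under 3x replication occupies 3 x 64 KiB = 192 KiB instead of 144 KiB, about 1.3x amplification, while under EC 4+2 it is split into four 12 KiB chunks plus two parity chunks, each rounded up to 64 KiB, so it occupies 6 x 64 KiB = 384 KiB instead of 72 KiB, more than 5x.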
On 2021-02-02 16:50, Steven Pine wrote:
> You are unlikely to avoid the space amplification bug by using larger
> block sizes. I honestly do not recommend using an EC pool, it is
> generally less performant and EC pools are not as well supported by
> the ceph development community.
>
> On Tue, Feb 2, 2021 at 5:11 AM Gilles Mocellin
> <gilles.mocellin(a)nuagelibre.org> wrote:
>
>> Hello,
>>
>> As we know, with 64k for bluestore_min_alloc_size_hdd (I'm only using
>> HDDs), in certain conditions, especially with erasure coding, there is
>> a loss of space when writing objects smaller than 64k x k (EC: k+m).
>>
>> Every object is divided into k elements, written on different OSDs.
>>
>> My main use case is big (40TB) RBD images mounted as XFS filesystems
>> on
>> Linux servers,
>> exposed to our backup software.
>> So, it's mainly big files.
>>
>> My thought, but I'd like some other points of view, is that I could
>> deal with the amplification by using bigger block sizes on my XFS
>> filesystems, instead of reducing bluestore_min_alloc_size_hdd on all
>> OSDs.
>>
>> What do you think?
>
> --
>
> Steven Pine
>
> E steven.pine(a)webair.com | P 516.938.4100 x
>
> Webair | 501 Franklin Avenue Suite 200, Garden City NY, 11530
>
> webair.com
>
Hi,
After upgrading from 15.2.5 to 15.2.8, I see this health error.
Has anyone seen this? "ceph log last cephadm" doesn't show anything
about it. How can I trace it?
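If more output would help, I can collect, for example:

    ceph health detail      # full text of the health error
    ceph crash ls           # any recent daemon crashes
    ceph orch ps            # state of the cephadm-managed daemons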
Thanks!
Tony