On Fri, Dec 06, 2019 at 10:00:04AM +0100, Sebastien Han wrote:
> Inline:
> Thanks!
> ---------
> Sébastien Han
> Senior Principal Software Engineer, Storage Architect
> "Always give 100%. Unless you're giving blood."
> On Fri, Dec 6, 2019 at 12:22 AM Kai Wagner <kwagner@suse.com> wrote:
>> On 05.12.19 10:33, Sebastien Han wrote:
>>> Sorry, this has turned again into a ceph-disk/ceph-volume discussion...
>> Yes, because that's basically what you're doing here.
> So what?
>> Personally, I don't see the value in just switching back again, as I haven't
>> found really good and absolutely necessary reasons to do so, nor reasons why
>> Rook and/or ceph-volume couldn't be fixed in that regard. I also couldn't
>> find out why the new solution would be any better, or any guarantee that we
>> won't be talking about switching back again in a few months once all of
>> those issues are fixed. You said it yourself, "ultimately fixed", which at
>> the same time means that such tools need some time to reach the point where
>> all those issues are fixed. We need to start talking across boundaries,
>> reaching out and involving other projects' maintainers to work on real
>> solutions instead of moving out of their way.
> ceph-volume is a sunk cost!
> And your argument falls squarely into that paradigm: "oh, we have invested so
> much already that we cannot stop, so we should continue even though this will
> only bring more trouble". That is being incapable of accepting the sunk cost.
> All those issues were fixed with a lot of pain. All that pain could have been
> avoided if LVM weren't there, and pursuing that direction will only lead us
> to more pain again.
I don't think this was the argument. The intention is to at least try to fix
the code we have, instead of inventing something new yet again. I'm still not
sure what you mean by "a lot of pain". Nothing ever made it into the c-v
tracker so we could see whether things can be improved. What are these painful
issues, and are they so bad that we cannot work on them as a community? I'm
aware of https://github.com/rook/rook/pull/3771 and
https://github.com/rook/rook/pull/4219. And again, I'm more than happy to
improve the c-v side of things; I just can't when I'm not aware of the issues.
>> Just imagine we switched back to partitions: that would mean making yet
>> another breaking change between releases. This change would end up just like
>> filestore->bluestore or ceph-disk->ceph-volume: people would have to rewrite
>> the whole data of their cluster at one point or another (or do we want to
>> support both tools forever?).
> That's completely wrong. Changing the tool that provisions OSDs does not mean
> you must trash and re-create your OSDs.
> That's why ceph-volume has the "simple scan" command interface.
> ceph-ansible handled that migration pretty much transparently, without any
> breaking change.
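>
> Roughly, that adoption flow looks like this (the device name is just an
> example):
>
>     # Capture a legacy ceph-disk OSD's layout into
>     # /etc/ceph/osd/{id}-{fsid}.json
>     ceph-volume simple scan /dev/sdb1
>     # From then on, activate OSDs from the captured JSON instead of
>     # relying on ceph-disk's udev rules
>     ceph-volume simple activate --all
>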
So you're proposing that users with LVM-deployed OSDs can migrate without
redeploying or any service interruption? How would that work?
>> I have the feeling that we sometimes still treat this whole project like a
>> kick-starter project where everything can be switched and changed between
>> releases, in any direction, whenever we think it would be nice to do so.
>> Could we please start to consider that there is a big and, luckily, growing
>> user base that actively relies on this solution, and that they're not keen
>> on moving their data around just because we had a "feeling"?
> I disagree; we are just talking about the provisioning tool, with no breaking
> changes involved, and I don't even recall any major re-architecture that ever
> introduced difficulties for users.
Every update usually introduces pain points for users; bluestore was certainly
one. Sure, no user has to migrate unless they want the shiny new features.
>> No one should feel personally offended by the last remark, but introducing
>> such changes isn't something we could or should just do. We really should
>> have 100% valid arguments for why replacing one tool with another tool we
>> already had in the past would now magically solve everything, and we should
>> be sure that we can't fix those issues in the current solution.
> 100% is a myth; there is no such thing, and no, we didn't have that in the
> past. Back then I warned about what was coming with containers, and here we
> are...
> Also, I'm not saying we should replace the tool, but rather that we should
> allow not using LVM, starting with a simple scenario.
> Anyway, again, this discussion has gotten intense and we should probably stop
> the storm here.
> I'm not a big fan of the "ceph-volume activate" wrapper idea, but short term
> that's probably the best thing we can do; I'll send an e-mail about this
> shortly.
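>
> To sketch the idea (hypothetical, not an agreed interface): a thin dispatcher
> that hides which backend provisioned the OSD, for example
>
>     # Hypothetical wrapper invocation; the interface is illustrative only
>     ceph-volume activate <osd-id> <osd-fsid>
>     # internally dispatching to one of the existing subcommands:
>     #   ceph-volume lvm activate <osd-id> <osd-fsid>     # LVM-provisioned OSD
>     #   ceph-volume simple activate <osd-id> <osd-fsid>  # adopted ceph-disk OSD
>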
> I'll also investigate not using ceph-volume in Rook when colocating
> block/db/wal on the same disk for a bluestore OSD.
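>
> A minimal sketch of what that could look like, bypassing LVM entirely (the
> device path is an example and keyring distribution is glossed over; this
> follows the manual OSD bootstrap steps from the Ceph docs):
>
>     OSD_UUID=$(uuidgen)
>     # Allocate a new OSD id in the cluster for this uuid
>     OSD_ID=$(ceph osd new ${OSD_UUID})
>     mkdir -p /var/lib/ceph/osd/ceph-${OSD_ID}
>     # bluestore data, db and wal all colocated on the raw device
>     ln -s /dev/sdb /var/lib/ceph/osd/ceph-${OSD_ID}/block
>     # Format the bluestore OSD and create its auth key
>     ceph-osd -i ${OSD_ID} --mkfs --mkkey --osd-uuid ${OSD_UUID}
>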
> Thanks everyone for your replies; I appreciate the input.
>> Kai
--
Jan Fajerski
Senior Software Engineer Enterprise Storage
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany