Sage Weil <sage@newdream.net> wrote on 18.07.2019 17:00:18:
> On Thu, 18 Jul 2019, Ulrich Weigand wrote:
> > > I think the cleanest thing is to s/__le/ceph_le/ everywhere *except*
> > > msgr.h, rados.h, and ceph_fs.h.  It seems like what happened was we
> > > mostly forgot to always use the ceph_ variant after 2010.
> > >
> > > That should resolve the issue, right?
> >
> > That's what we've tried to do, and it does indeed make work Ceph a lot
> > better on IBM Z (still some issues, but probably unrelated).  Once we've
> > gotten a bit farther with testing, we can certainly send patches to
> > that effect.
>
> Great, thanks!

There's a few places where this approach causes problems because __le32
etc. are used for objects that are being operated on.  The ceph_le32 etc.
types really only allow assigning from and to objects of the type, not
any other operations.  (Not even initialization, which is a bit weird
in some places -- should there be a constructor defined?)

In most cases I was able to resolve those issues in a straightforward
way (e.g. for a local variable we should just use __u32 instead of __le32).
However, one file makes widespread use of this: os/Transaction.h.
I'm not really sure how to fix this -- are the structures in this file
intended to be local in-memory objects (then why are they using __leXX?),
or are they intended to be on-disk/network serialized forms (then why are
there lots of arithmetic and other operations performed directly on those
types?).


> > However, where I'm still unsure is that there appear to be a number of
> > files that e.g. just #include "ceph_fs.h" directly, without going
> > through the re-define in types.h ...
> >
> > Now maybe this doesn't matter because none of those places actually
> > use any of the structs that use __le32 from those headers, but that
> > seems hard to verify (and a bit fragile ...).
> >
> > Would the correct fix for this be to replace the direct includes
> > with includes of types.h?  Or should those headers be changed to be
> > more self-contained and always define __le32 themselves when built
> > in user space (but I'm not quite sure how to do that without
> > conflicting with the __le32 definition in linux/types.h ...)?
> >
> > In any case, it would be really good to find a way to have those
> > rules (e.g. never use __le32, never include ceph_fs.h directly)
> > automatically enforced at build time, to avoid any reoccurance
> > of the problem with future code changes.
>
> Yeah, agree.
>
> I think the best option would be to make those headers have a bit at the
> top that ensure thigns are correctly defined for userspace.  We don't want
> that block to leak into the kernel tree, but it's easy enough to ignore a
> single block of code; much harder to keep them in sync if we were to, say,
> s/__le32/ceph_le32/ in the headers.

It does indeed look like by adding something like
#ifndef __KERNEL__
#include "byteorder.h"
#define __le16 ceph_le16
#define __le32 ceph_le32
#define __le64 ceph_le64
#endif
at the beginning and the corresponding
#ifndef __KERNEL__
#undef __le16
#undef __le32
#undef __le64
#endif
at the end of those files.

Bye,
Ulrich