RBD is never a workable solution unless you are willing to pay the cost of
double replication, once in HDFS and again in Ceph.
I think the right approach is to look at other implementations of the
Hadoop FileSystem interface, such as s3a and the local filesystem.
s3a is straightforward: Ceph RGW provides an S3 interface, and s3a is
stable and well tested in the Hadoop ecosystem, so you can just run it.
There are also a few in-house solutions from some vendors that integrate
librgw directly into the s3a driver, which saves one extra hop plus the
management/load-balancing cost of maintaining an RGW cluster.
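As a rough sketch of that s3a-on-RGW setup (the endpoint hostname and keys below are placeholders, not from this thread), a core-site.xml might look like:

```xml
<!-- core-site.xml: point s3a at a Ceph RGW endpoint.
     Hostname and credentials are placeholder examples. -->
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://rgw.example.internal:8080</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_RGW_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_RGW_SECRET_KEY</value>
  </property>
  <!-- RGW is typically addressed path-style rather than virtual-hosted-style -->
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
</configuration>
```

Jobs can then address buckets as s3a://bucket-name/path.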
The local filesystem is a bit trickier. We tried a POC that mounts CephFS
on every Hadoop node and configures Hadoop to use LocalFS with
replication = 1. Each piece of data then ends up written only once into
CephFS, and CephFS takes care of data durability.
There was a libcephfs-jni, but it is significantly out of date and seems
to be abandoned, which is a pity.
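A minimal sketch of that POC configuration, assuming CephFS is mounted at the same path on every node (the /mnt/cephfs mount point and paths are assumed examples, not from the post):

```xml
<!-- core-site.xml: default filesystem is the local filesystem,
     backed by a CephFS mount at an assumed example path -->
<property>
  <name>fs.defaultFS</name>
  <value>file:///mnt/cephfs/hadoop</value>
</property>

<!-- hdfs-site.xml: a single replica, since CephFS already
     provides durability underneath -->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```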
With both solutions you of course lose data locality, but you trade it
for better scalability and compute/storage separation.
-Xiaoxi
Marc Roos <M.Roos(a)f1-outsourcing.eu> wrote on Fri, 24 Apr 2020 at 16:00:
I think the idea behind a pool size of 1 is that Hadoop already writes
copies to 2 other pools(?).
However, that leaves the possibility that PGs of these 3 pools may share
an OSD, and if that OSD fails, you lose data in all of them. I have no
idea what the chances are that the same data from different pools ends
up on the same OSD.
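One way to make that concern concrete is to compare the PG-to-OSD mappings of the pools (e.g. extracted from `ceph pg dump --format json`) and look for OSDs shared between them. Below is a minimal sketch of just the overlap check; the pool names and mappings are invented examples, not real cluster output:

```python
# Sketch: given each pool's PG -> acting-OSD mappings, find OSDs that
# serve PGs from more than one of the size-1 "replica" pools. If any
# exist, losing that single OSD can destroy data in several pools at
# once. The mappings below are made-up examples, not real `ceph pg dump`
# output.

def osds_shared_between_pools(pool_pg_osds):
    """pool_pg_osds: {pool_name: {pg_id: [osd, ...]}}.
    Returns the set of OSDs appearing in the acting sets of 2+ pools."""
    seen = {}  # osd id -> set of pool names it serves
    for pool, pgs in pool_pg_osds.items():
        for osds in pgs.values():
            for osd in osds:
                seen.setdefault(osd, set()).add(pool)
    return {osd for osd, pools in seen.items() if len(pools) > 1}

# Example: pools hadoop1 and hadoop2 both have a PG acting on osd.3
example = {
    "hadoop1": {"7.0": [3], "7.1": [5]},
    "hadoop2": {"8.0": [3], "8.1": [9]},
    "hadoop3": {"9.0": [2], "9.1": [4]},
}
print(osds_shared_between_pools(example))  # -> {3}
```

In a real cluster the only robust fix is to separate the pools by failure domain in the CRUSH rules; without that, overlap is a matter of chance.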
-----Original Message-----
To: ceph-users(a)ceph.io
Subject: [ceph-users] HBase/HDFS on Ceph/CephFS
Hi
We have a 3-year-old Hadoop cluster - up for refresh - so it is time to
evaluate options. The "only" use case is running an HBase installation,
which is important to us, and migrating off HBase would be a hassle.
Our Ceph usage has expanded and in general - we really like what we see.
Thus - can this be "sanely" consolidated somehow? I have seen this:
https://docs.ceph.com/docs/jewel/cephfs/hadoop/
But it seems really, really bogus to me.
It recommends that you set:
pool 3 'hadoop1' rep size 1 min_size 1
Which would - if I understand correctly - be disastrous. The Hadoop end
would replicate 3x across nodes - but within Ceph the replication would
be 1.
With 1x replication in Ceph, pulling an OSD node would "guarantee" that
its PGs go inactive - which could be OK - but there is nothing
guaranteeing that the other Hadoop replicas are not served out of the
same OSD node/PG. In which case, rebooting an OSD node would make the
Hadoop cluster unavailable.
Is anyone serving HBase out of Ceph - how do the stack and
configuration look? If I went for 3x replication in both Ceph and HDFS
it would definitely work, but 9 copies of the dataset is a bit more
than what looks feasible at the moment.
Thanks for your reflections/input.
Jesper
_______________________________________________
ceph-users mailing list -- ceph-users(a)ceph.io
To unsubscribe send an email to ceph-users-leave(a)ceph.io