I don't think you need a PaxosService for this. I think you can
maintain this metadata with existing rados mechanisms by storing the
information directly in rados objects. It is probably worth
researching how rgw and rbd already handle this.
-Sam
On Sun, Mar 28, 2021 at 6:42 PM Liu, Changcheng
<changcheng.liu(a)intel.com> wrote:
Updated the framework diagram, since the original diagram may not be shown
correctly in a browser or in some apps:
https://gist.github.com/changchengx/67694841d9559debf5bbf31a16d5bd0d
On 09:11 Fri 26 Mar, Liu, Changcheng wrote:
Hi all,
This email talks about how to design:
1) ReplicaDaemon:
What kind of info the daemon, running on a host with DCPMM and an
RNIC (RDMA NIC), reports to Ceph/Monitor.
2) ReplicaMonitor:
ReplicaMonitor, a new PaxosService in Ceph/Monitor, manages the
ReplicaDaemons' info and handles librbd's requests by selecting the
appropriate ReplicaDaemons' info and returning it to librbd.
This email doesn't talk about:
How librbd will communicate with a ReplicaDaemon after it gets the
ReplicaDaemons' info, or how the replication is completed.
RFC PR: [WIP] aggregate client state and route info
https://github.com/ceph/ceph/pull/37931
Detail:
+-----------------------------------+  +-----------------------------------------------+
|+---------------------------------+|  |                         +--------------------+|
|| ReplicaDaemonInfo:              ||  |                         |PaxosServiceMessage ||
||                                 ||  |+---------------------------------------------+|
||   daemon_id;                    ||  ||MReplicaDaemonBlink(MSG_REPLICADAEMON_BLINK):||
||   rnic_bind_port;               ||  ||                                             ||
||   rnic_addr;                    ||  ||ReplicaDaemonInfo;                           ||
||   free_size;                    ||  |+---------------------------------------------+|
|+---------------------------------+|  |                         +--------------------+|
|+---------------------------------+|  |                         |PaxosServiceMessage ||
|| ReqReplicaDaemonInfo:           ||  |+---------------------------------------------+|
||                                 ||  ||MMonGetReplicaDaemonMap(                     ||
||   replicas;                     ||  ||  CEPH_MSG_MON_GET_REPLICADAEMONMAP):        ||
||   replica_size;                 ||  ||                                             ||
|+---------------------------------+|  ||ReqReplicaDaemonInfo;                        ||
|+---------------------------------+|  |+---------------------------------------------+|
|| ReplicaDaemonMap:               ||  |                                      +-------+|
||                                 ||  |                                      |Message||
|| std::vector<ReplicaDaemonInfo>; ||  |+---------------------------------------------+|
|+---------------------------------+|  ||MReplicaDaemonMap(CEPH_MSG_REPLICADAEMON_MAP)||
|   MetaData(need encode/decode)    |  ||                                             ||
|                                   |  ||                                             ||
|                                   |  ||ReplicaDaemonMap;                            ||
|                                   |  |+---------------------------------------------+|
|                                   |  |                                               |
|                                   |  |    Three messages defined for the MetaData    |
+-----------------------------------+  +-----------------------------------------------+
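To make the "need encode/decode" note concrete, here is a minimal sketch of ReplicaDaemonInfo and ReplicaDaemonMap with a round-trip encoder. Everything beyond the field names is an assumption: the diagram gives no field types, so the integer widths and the std::string address are guesses, and Buffer, put_uint, get_uint, decode_info and decode_map are hypothetical stand-ins for ceph::bufferlist and Ceph's own encode/decode helpers, not the code in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Field names come from the diagram; the types are assumptions.
struct ReplicaDaemonInfo {
  uint64_t daemon_id = 0;
  uint32_t rnic_bind_port = 0;
  std::string rnic_addr;
  uint64_t free_size = 0;  // assumed to be a byte count
};

struct ReplicaDaemonMap {
  std::vector<ReplicaDaemonInfo> daemons;
};

// Hypothetical stand-in for ceph::bufferlist.
using Buffer = std::vector<uint8_t>;

// Append the low n bytes of v in little-endian order.
inline void put_uint(Buffer& out, uint64_t v, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out.push_back(static_cast<uint8_t>(v >> (8 * i)));
}

// Read an n-byte little-endian integer at off and advance off.
inline uint64_t get_uint(const Buffer& in, std::size_t& off, std::size_t n) {
  assert(off + n <= in.size());
  uint64_t v = 0;
  for (std::size_t i = 0; i < n; ++i)
    v |= static_cast<uint64_t>(in[off + i]) << (8 * i);
  off += n;
  return v;
}

inline void encode(const ReplicaDaemonInfo& d, Buffer& out) {
  put_uint(out, d.daemon_id, 8);
  put_uint(out, d.rnic_bind_port, 4);
  put_uint(out, d.rnic_addr.size(), 4);  // length prefix for the string
  out.insert(out.end(), d.rnic_addr.begin(), d.rnic_addr.end());
  put_uint(out, d.free_size, 8);
}

inline ReplicaDaemonInfo decode_info(const Buffer& in, std::size_t& off) {
  ReplicaDaemonInfo d;
  d.daemon_id = get_uint(in, off, 8);
  d.rnic_bind_port = static_cast<uint32_t>(get_uint(in, off, 4));
  std::size_t len = static_cast<std::size_t>(get_uint(in, off, 4));
  assert(off + len <= in.size());
  d.rnic_addr.assign(in.begin() + off, in.begin() + off + len);
  off += len;
  d.free_size = get_uint(in, off, 8);
  return d;
}

inline void encode(const ReplicaDaemonMap& m, Buffer& out) {
  put_uint(out, m.daemons.size(), 4);  // element count, then each entry
  for (const auto& d : m.daemons) encode(d, out);
}

inline ReplicaDaemonMap decode_map(const Buffer& in, std::size_t& off) {
  ReplicaDaemonMap m;
  std::size_t n = static_cast<std::size_t>(get_uint(in, off, 4));
  for (std::size_t i = 0; i < n; ++i) m.daemons.push_back(decode_info(in, off));
  return m;
}
```

In the real tree these types would presumably use Ceph's versioned encoding macros on a bufferlist instead of a hand-rolled byte vector; the sketch only shows which fields have to survive the round trip.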
             +--------+                                             +------------+
             |Dispatch|                                             |PaxosService|
+---------------------+   Update ReplicaDaemonInfo   +---------------------------+
| ReplicaDaemon:      |           through            | ReplicaMonitor:           |
|                     |     MReplicaDaemonBlink      |                           |
| ReplicaDaemonInfo;  -------------------------------> ReplicaDaemonMap;         |
|                     |                              |                           |
| ms_dispatch;        |                              | //Need implement some APIs|
+---------------------+                              +------^-------------|------+
                                   Request ReplicaDaemonMap |             | Feedback ReplicaDaemonMap
                                                    through |             | through
                                    MMonGetReplicaDaemonMap |             | MReplicaDaemonMap
                                                     +------|-------------v------+
                                                     | librbd                    |
                                                     +---------------------------+
ReplicaDaemon reports its ReplicaDaemonInfo to ReplicaMonitor through the
MReplicaDaemonBlink message.
ReplicaMonitor stores all the ReplicaDaemonInfo in the ReplicaDaemonMap after
going through Paxos.
The client (librbd) sends MMonGetReplicaDaemonMap to ReplicaMonitor. ReplicaMonitor
chooses the appropriate ReplicaDaemons, packs their info into a new
ReplicaDaemonMap, and sends it back to the client in an MReplicaDaemonMap message.
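The flow above could be sketched roughly as follows. This is only an illustration under stated assumptions: how ReplicaMonitor chooses the "appropriate" daemons is not defined in this email, so the policy here (keep daemons whose free_size covers replica_size, prefer the ones with the most free space, return an empty map if fewer than `replicas` qualify) is made up, as are the field types and the helper names update_daemon and select_replicas. Paxos and the message layer are left out entirely.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Field names come from the diagram; the types are assumptions.
struct ReplicaDaemonInfo {
  uint64_t daemon_id = 0;
  uint32_t rnic_bind_port = 0;
  std::string rnic_addr;
  uint64_t free_size = 0;
};

// Assumed meaning: number of replicas wanted and space needed per replica.
struct ReqReplicaDaemonInfo {
  uint32_t replicas = 0;
  uint64_t replica_size = 0;
};

struct ReplicaDaemonMap {
  std::vector<ReplicaDaemonInfo> daemons;
};

// Hypothetical handler for MReplicaDaemonBlink: insert the daemon's info,
// or overwrite the existing entry with the same daemon_id.
inline void update_daemon(ReplicaDaemonMap& map, const ReplicaDaemonInfo& info) {
  for (auto& d : map.daemons) {
    if (d.daemon_id == info.daemon_id) {
      d = info;
      return;
    }
  }
  map.daemons.push_back(info);
}

// Hypothetical handler for MMonGetReplicaDaemonMap: build the reply map
// that would be sent back in MReplicaDaemonMap.
inline ReplicaDaemonMap select_replicas(const ReplicaDaemonMap& all,
                                        const ReqReplicaDaemonInfo& req) {
  ReplicaDaemonMap reply;
  // Keep only daemons with enough free space for one replica.
  for (const auto& d : all.daemons) {
    if (d.free_size >= req.replica_size) reply.daemons.push_back(d);
  }
  // Not enough candidates: return an empty map (assumed failure signal).
  if (reply.daemons.size() < req.replicas) {
    reply.daemons.clear();
    return reply;
  }
  // Prefer daemons with the most free space, then trim to the request.
  std::stable_sort(reply.daemons.begin(), reply.daemons.end(),
                   [](const ReplicaDaemonInfo& a, const ReplicaDaemonInfo& b) {
                     return a.free_size > b.free_size;
                   });
  reply.daemons.resize(req.replicas);
  return reply;
}
```

A real policy would likely also have to account for daemon liveness and for space already promised to other clients, which this sketch ignores.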
B.R.
Changcheng
_______________________________________________
Dev mailing list -- dev(a)ceph.io
To unsubscribe send an email to dev-leave(a)ceph.io