============================================
#opendaylight-clustering: clustering_hackers
============================================


Meeting started by moizer_ at 15:59:08 UTC.  The full logs are available
at
http://meetings.opendaylight.org/opendaylight-clustering/2015/clustering_hackers/opendaylight-clustering-clustering_hackers.2015-02-10-15.59.log.html
.


Meeting summary
---------------

* Gary Wu presenting information on Unified Secure Channel  (moizer_,
  16:06:55)
* wants to support call home like netconf call home  (moizer_, 16:07:26)
* device needs to make an inbound call to controller  (moizer_,
  16:07:39)
* device creates a call home connection  (moizer_, 16:08:23)
* this allows controller to talk to device  (moizer_, 16:08:57)
* assumption is that any node in the cluster should be able to respond
  to a request instead of bouncing it around  (moizer_, 16:11:11)
* important that rpc request needs to be routed to the node with the
  connection  (moizer_, 16:11:58)
* Other considerations: scalability; should call home devices be
  “multi-homed” to multiple controller nodes  (tbachman, 16:12:29)
* moizer_ asks gwu if the idea is that the request to controller be
  bounced — is that so you don’t get a redirect?  (tbachman, 16:13:08)
* gwu says yes  (tbachman, 16:13:12)
* moizer_ says that the routed RPC mechanism should support this
  (tbachman, 16:14:18)
* uchau asks in the clustering model, what happens to an OF switch when
  that node goes down; needs device ownership model so that the device
  can work with another node in the controller  (tbachman, 16:15:48)
* gwu says when a node goes down, the device needs to reconnect with one
  of the other nodes  (tbachman, 16:16:07)
* uchau asks if USC was going to have openflow also go through the
  secure channel  (tbachman, 16:16:26)
* gwu says yes  (tbachman, 16:16:28)
* uchau is interested in developing a device ownership concept, which
  helps provide failover direction  (tbachman, 16:16:58)
* uchau says in this case, if OF connects directly or through secure
  channel, the ownership model is the same  (tbachman, 16:17:32)
* gwu asks how openflow deals with multihoming/mastership?  (tbachman,
  16:17:43)
* uchau says the openflow team is implementing a message that allows the
  controller to assert the role  (tbachman, 16:17:58)
* uchau says that it can look at the device ownership when a device
  connects, and assert the role  (tbachman, 16:18:16)
* Helen says that clustering already has a supernode concept — asks if
  this is related  (tbachman, 16:19:33)
* moizer_ says for data, there is a concept of leaders and followers,
  but that does not mean you can go to another node to access inventory
  (tbachman, 16:20:53)
* Helen asks that w/o a load balancer, is it possible for clustering to
  solve this problem  (tbachman, 16:26:00)
* moizer_ recommends using virtual IPs for the controller  (tbachman,
  16:26:18)
* uchau says one option is to have the device connect to all the
  controllers in a team, which is similar to the openflow model
  (tbachman, 16:27:13)
* moizer_ says one problem with using a virtual IP and load balancing is
  how to do keep-alives  (tbachman, 16:30:28)
* gwu asks what the scalability is of that model — how many connections
  can a node handle  (tbachman, 16:30:58)
* uchau says that jmedved was maybe targeting 5k, but wasn’t sure
  whether that was per-node or per-cluster  (tbachman, 16:31:51)
* Helen says that their requirement is for 1 million devices  (tbachman,
  16:32:04)
* moizer_ says with clustering, we can only store what we can fit into
  memory (i.e. storage can’t exceed the amount of memory available)
  (tbachman, 16:33:27)
* moizer_ says that’s a lot of operational data  (tbachman, 16:33:31)
* Helen says all the other data is stateless  (tbachman, 16:33:42)
* moizer_ says 1 million devices, and suspects that’s a lot of data in
  memory  (tbachman, 16:34:02)
* Fabiel Zuniga says that the persistence service may be able to help
  here  (tbachman, 16:34:49)
* markmozolewski says devices could maintain 1 Master / 1-2 Slave
  (backup) connections and establish new slave connections as failover
  occurs (vs. maintaining connections to all slaves), for cluster sizes
  >> 3.  (tbachman, 16:35:04)
* moizer_ recommends connecting a bunch of devices and seeing how things
  perform  (tbachman, 16:36:09)
* uchau asks if Helen wants the controller to support the load
  balancing, or using external load balancers  (tbachman, 16:37:32)
* uchau guesses that the 1 million nodes is to be supported by the
  cluster, not by a single node in the cluster  (tbachman, 16:37:57)
* moizer_ says with 64 switches in openflow, it takes about 4-1/2 MB in
  the data store  (tbachman, 16:39:18)
* I need to talk about bugs/patches for 10 mins  (moizer_, 16:39:38)
* catohornet asks with timeouts in the cluster — sees issue with many
  nodes, and where they’re configured topologically  (tbachman,
  16:40:08)
* moizer_ says you don’t need to have every node fully replicated; as an
  example, with routing logic and 5 cluster nodes, you might choose to
  do replication on only 3 of the nodes  (tbachman, 16:40:39)
* gwu asks if the proposal is workable  (tbachman, 16:42:16)
* moizer_ says yes  (tbachman, 16:42:18)
* gwu was thinking of presenting statistics to the MD-SAL (e.g. bytes
  transferred); asks about this (e.g. effects on data store as things
  scale)  (tbachman, 16:42:50)
* moizer_ says if stats collection interval isn’t too short, then it
  should be okay (e.g. no client will be reading stats every 3 seconds)
  (tbachman, 16:43:26)
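
The memory discussion above can be put into rough numbers. The sketch
below scales the quoted figure of about 4-1/2 MB for 64 openflow
switches up to the 1 million devices Helen mentioned. It assumes the
data store grows linearly with device count and that a call home device
costs about as much as an openflow switch; neither assumption was
stated in the meeting, and the function name is hypothetical.

```python
# Back-of-the-envelope estimate only. Linear scaling is an assumption,
# not something measured or stated in the meeting.
MEASURED_MB = 4.5            # data store size quoted for 64 switches
MEASURED_DEVICES = 64
TARGET_DEVICES = 1_000_000   # device count quoted as the requirement


def estimate_datastore_mb(devices,
                          measured_mb=MEASURED_MB,
                          measured_devices=MEASURED_DEVICES):
    """Scale the measured data store size linearly to `devices`."""
    return measured_mb / measured_devices * devices


estimated_mb = estimate_datastore_mb(TARGET_DEVICES)
print(f"{estimated_mb:,.1f} MB (~{estimated_mb / 1000:.1f} GB)")
```

Under those assumptions the total comes to roughly 70 GB of in-memory
data across the cluster, which is why connecting a large number of
devices and measuring, as moizer_ recommends, is the more reliable
guide.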


Meeting ended at 17:58:15 UTC.


People present (lines said)
---------------------------

* tbachman (54)
* moizer_ (13)
* odl_meetbot (3)
* markmozolewski (3)


Generated by `MeetBot`_ 0.1.4