14:00:13 <fdegir> #startmeeting Cross Community CI
14:00:13 <collabot> Meeting started Wed Jan 24 14:00:13 2018 UTC.  The chair is fdegir. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 <collabot> Useful Commands: #action #agreed #help #info #idea #link #topic.
14:00:13 <collabot> The meeting name has been set to 'cross_community_ci'
14:00:25 <fdegir> anyone around for xci meeting?
14:00:28 <fdegir> #topic Rollcall
14:00:48 <fdegir> agenda is on: https://etherpad.opnfv.org/p/xci-meetings
14:00:52 <hw_wutianwei> hi
14:01:06 <fdegir> hi hw_wutianwei
14:01:12 <david_Orange> hi
14:01:13 <hw_wutianwei> #info Tianwei Wu
14:01:17 <david_Orange> #info David Blaisonneau
14:01:24 <fdegir> david_Orange: hi
14:01:28 <mardim> #info Dimitrios Markou
14:01:37 <fdegir> let's start
14:01:43 <fdegir> and others join on the way
14:01:45 <hwoarang> #info Markos Chandras
14:01:56 <fdegir> #topic Scenario/Feature Status: Kubernetes in XCI
14:02:07 <fdegir> hw_wutianwei: the patch is almost there to be merged
14:02:14 <hw_wutianwei> yes
14:02:15 <fdegir> hw_wutianwei: anything you'd like to add?
14:02:29 <joekidder> #info Joe Kidder (sorry late for roll call)
14:02:30 <hw_wutianwei> I saw the comments of hwoarang, i will update
14:02:50 <hw_wutianwei> and after the patch is merged, I will try to install K8s on centos.
14:02:51 <fdegir> #info The patch is very close to being merged. Everyone is urged to review it and provide any input you have.
14:02:59 <fdegir> #link https://gerrit.opnfv.org/gerrit/#/c/50213/
14:03:31 <fdegir> #info Only ubuntu support will be available at the beginning. Centos work will start once the patch gets merged.
14:03:45 <fdegir> thanks hw_wutianwei for bringing k8s into xci!
14:03:54 <hw_wutianwei> that's all
14:03:54 <jmorgan1> #info Jack Morgan
14:04:00 <mbuil> #info mbuil
14:04:03 <fdegir> #topic Scenario/Feature Status: Congress in XCI
14:04:05 <mbuil> #info Manuel Buil
14:04:12 <fdegir> taseer1: taseer2: around?
14:05:00 <fdegir> #info Final piece is under development/review and will hopefully be done soon.
14:05:09 <fdegir> #link https://review.openstack.org/#/c/522491/
14:05:34 <fdegir> #topic Scenario/Feature Status: Blazar in XCI
14:05:50 <fdegir> #info The blueprint submitted by taseer2 has been merged.
14:05:58 <fdegir> #link https://review.openstack.org/#/c/528567/
14:06:17 <fdegir> #info Taseer also started working on the role so more things will start appearing.
14:06:35 <fdegir> #info This work is done with the help from OPNFV Promise team and their comments are incorporated.
14:06:56 <fdegir> #topic Scenario/Feature Status: os-odl-sfc
14:07:05 <fdegir> mbuil: mardim: any updates?
14:07:18 <mbuil> #info The proposed SHAs are breaking SFC scenario because of a bug in the neutron-odl driver. Discussions on-going with that community to fix it. It is not a critical bug
14:07:22 <mardim> nothing from me
14:08:03 <fdegir> mbuil: I suppose this issue is valid for os-odl-nofeature scenario as well?
14:08:07 <mbuil> I think that's it... should we touch upon the baremetal deployment here?
14:08:32 <fdegir> mbuil: will come to that in a minute after some more info about this bug
14:08:34 <mbuil> fdegir: yes! It is not a critical problem because it appears when deleting a security group whose security rules were deleted before (e.g. this is how SNAPS does it)
14:08:56 <mbuil> there is a race condition
14:09:04 <fdegir> mbuil: the reason I am asking this is that when we start functest, this issue might become blocker for patchset verification and post merge jobs
14:09:27 <fdegir> mbuil: so how this will be handled?
14:09:34 <fdegir> snaps fixing it or upstream fixing it or?
14:09:38 <mbuil> fdegir: A workaround is deleting the security group without deleting the security rules, which results in the same state at the end
14:09:49 <fdegir> mbuil: ok, where this workaround will be?
14:10:27 <mbuil> That workaround should happen at the scenario level. However, the current functest healthcheck has a testcase which tests the deletion of rules and security groups
14:11:27 <electrocucaracha> #info Victor Morales
14:11:45 <mbuil> We are going to change the SFC scenario so that it does not delete the rules but deletes the security group directly because both ways are fine for us. However, we should consider what to do with the functest healthcheck
14:12:01 <fdegir> this is what I was after
14:12:10 <mbuil> Upstream already has two patches to fix the issue, so it might be fixed pretty quickly
14:12:22 <fdegir> let me info these in
14:12:33 <fdegir> mbuil: and please put the links to patches/jira tickets or whatever you have
14:13:20 <mbuil> fdegir: ok, please info that in and then I'll paste the links
14:13:25 <fdegir> #info The bug in the neutron-odl driver impacts the os-odl-nofeature scenario as well, but the bug is not critical since it appears when deleting a security group whose security rules were deleted before (e.g. this is how SNAPS does it) - race condition
14:13:32 <fdegir> #info A workaround is deleting the security group without deleting the security rules, which results in the same state at the end
14:13:40 <hwoarang> hmm the networking-odl fixes will have to land in upstream OSA first and then come to us via an a-r-r bump, so it will take quite a bit
14:13:44 <mbuil> #link https://review.openstack.org/#/c/536935/
14:13:50 <fdegir> #info That workaround should happen at the scenario level. However, the current functest healthcheck has a testcase which tests the deletion of rules and security groups
14:13:53 <mbuil> #link https://review.openstack.org/#/c/533706/
14:14:03 <fdegir> #info Upstream has already two patches to fix the issue, so it might be fixed pretty quickly. See the links.
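[editor's note] The workaround mbuil describes can be sketched with a toy client so the two deletion orders are easy to compare. All names below (FakeNeutronClient, delete_rule, ...) are illustrative, not the real neutron or SNAPS API:

```python
class FakeNeutronClient:
    """Records API calls so the two deletion orders can be compared."""
    def __init__(self):
        self.calls = []

    def delete_rule(self, rule_id):
        # In the buggy networking-odl driver, each of these per-rule
        # deletes can race with the group delete that follows.
        self.calls.append(("delete_rule", rule_id))

    def delete_security_group(self, sg_id):
        # Server side, deleting the group also removes its rules.
        self.calls.append(("delete_security_group", sg_id))


def delete_sg_rule_by_rule(client, sg):
    """Problematic order (what SNAPS does): rules first, then the group."""
    for rule_id in sg["rules"]:
        client.delete_rule(rule_id)
    client.delete_security_group(sg["id"])


def delete_sg_directly(client, sg):
    """Workaround: one call; the rules go away with the group."""
    client.delete_security_group(sg["id"])
```

Both paths end in the same state (group and rules gone); the workaround simply skips the per-rule deletes that trigger the race.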
14:14:22 <mbuil> Both committers are now discussing which approach is best
14:14:22 <fdegir> mbuil: so what about what hwoarang says?
14:14:40 <fdegir> mbuil: should we have this workaround applied on functest/snaps until the real fix lands?
14:14:45 <mbuil> hwoarang, fdegir: Yes
14:15:02 <fdegir> mbuil: so not just xci/sfc but whoever has the issue can use the workaround
14:15:21 <fdegir> mbuil: can you submit a jira ticket to functest or snaps and see what they say?
14:15:25 <hwoarang> fyi the queens release is this week, right?
14:15:27 <mbuil> fdegir: We should discuss that with Steve. I suggest whitelisting that test so that we are aware the problem exists but it does not block us
14:15:44 <hwoarang> so the fixes may not make it in the initial release :(
14:15:51 <fdegir> maybe this info is relevant for dmcbride as well
14:16:05 <mardim> hwoarang, I think this week is the code freeze
14:16:09 <mardim> but not sure
14:16:24 <hwoarang> ah yeah
14:16:26 <fdegir> mbuil: can you create the jira ticket on snaps so we have a record
14:16:35 <fdegir> and then we follow it up through that
14:16:54 <mbuil> fdegir: yes sir
14:16:57 <fdegir> it is fine if the ticket gets closed with no action
14:16:59 <fdegir> thanks mbuil
14:17:14 <fdegir> so moving to CI side of things for os-odl-sfc
14:17:28 <fdegir> #info Post-merge jobs for sfc have been created, running virtual deployments
14:17:46 <fdegir> #info Work to incorporate functest is in progress as well
14:18:20 <fdegir> #info Based on what we have available at the moment (only deployment, no functest), sfc has been promoted to baremetal, which mbuil has started working on
14:18:26 <fdegir> mbuil: so, baremetal?
14:20:09 <mbuil> fdegir: I was given the task to try SFC on baremetal using the patches that provide that functionality. However, it would be fantastic if somebody could explain to me how to combine those patches and describe the steps for doing that
14:20:25 <fdegir> david_Orange: ^
14:20:33 <fdegir> david_Orange: can you help mbuil to try your patches?
14:20:54 <fdegir> david_Orange: so the work with your patches and on baremetal progress together
14:20:58 <david_Orange> fdegir, lets talk about that on stabilization topic, but of course
14:21:09 <fdegir> david_Orange: thanks David
14:21:26 <fdegir> #info mbuil started working with sfc on baremetal. LF POD4 is where the work is going on
14:21:34 <fdegir> #info mbuil and david_Orange will work together on this
14:21:37 <fdegir> ok, moving on
14:21:44 <mbuil> david_Orange: Any info about how I could deploy xci on baremetal would be really appreciated. We could combine anything you have + experience and create a wiki page for example
14:22:06 <fdegir> #topic Scenario/Feature Status: os-odl-bgpvpn
14:22:44 <fdegir> #info epalper submitted a blueprint to osa for bgpvpn and the blueprint has been accepted/merged
14:22:53 <fdegir> #link https://review.openstack.org/#/c/523171/
14:23:23 <fdegir> #info The work should start soon both upstream and in opnfv bgpvpn
14:23:44 <fdegir> #topic Scenario/Feature Status: Masakari in XCI
14:24:03 <fdegir> #info The Masakari project team will start working on OSA blueprint and roles upstream once Queens is released
14:24:12 <fdegir> #info We should start seeing some progress there soon
14:24:30 <fdegir> #topic General Framework Updates: Centos Support
14:24:43 <fdegir> #info The patches introducing Centos support have been merged
14:25:03 <fdegir> #info os-nosdn-nofeature scenario now works on all 3 distros we have: ubuntu, opensuse, and centos
14:25:36 <fdegir> #topic General Framework Updates: Improving Stability
14:25:44 <fdegir> david_Orange: your turn
14:25:55 <david_Orange> fdegir, sure
14:26:37 <david_Orange> i worked on stabilization and integrated a major review remark: the PDF i used was not the official one
14:27:17 <david_Orange> i wrongly thought the propositions i made had been incorporated
14:27:41 <david_Orange> so i worked on that, and simplified the ip distribution on nodes
14:28:10 <fdegir> #info The PDF david_Orange used during his work wasn't the official one and his proposal wasn't incorporated to official version
14:28:26 <david_Orange> to keep it simple, the ips are fixed in the pdf and not incremented by the code
14:29:06 <david_Orange> i also faced an issue on virtualBMC during the deploy from bifrost
14:29:43 <david_Orange> i suppose the burst at VM start is too heavy for vbmc+virsh, so i deploy node by node
14:30:20 <fdegir> david_Orange: sorry, just a moment
14:30:30 <david_Orange> so i am now testing the whole deploy, including OSA, and not only the infra level
14:30:32 <david_Orange> fdegir, yes
14:30:50 <fdegir> david_Orange: do you mean if the provisioning is done concurrently, you had problem
14:30:59 <fdegir> and you needed to do it sequentially?
14:31:04 <david_Orange> yes
14:31:30 <david_Orange> fdegir, but i set the bifrost option wait_for_node to false
14:31:45 <david_Orange> fdegir, so it is not a long process
14:32:05 <fdegir> david_Orange: do you have some data regarding what is the difference in time when done this way?
14:32:12 <fdegir> like few minutes or 10+ or ?
14:32:47 <david_Orange> fdegir, no, but it is less than a few minutes
14:33:20 <fdegir> david_Orange: ok, that should be fine until we find time to look into root cause more closely
14:33:47 <david_Orange> fdegir, with wait_for_node_deploy: false it does not wait for the pxe process to end
14:33:59 <fdegir> #info The provisioning is done sequentially so nodes are deployed one by one
14:34:09 <fdegir> #info With wait_for_node_deploy: false, it does not wait for the PXE process to end
14:34:37 <david_Orange> fdegir, the issue was that on a 3 node deploy there was always one or more nodes with a failure on IPMI start
14:34:49 <david_Orange> fdegir, but never when i test it manually
14:34:59 <david_Orange> fdegir, and of course never on the same node :)
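[editor's note] The sequential flow david_Orange describes could look roughly like this in an Ansible play. This is a sketch only: the host group and role name are assumptions, while `wait_for_node_deploy` is the bifrost variable mentioned above:

```yaml
# Provision baremetal nodes one by one to avoid the vbmc+virsh burst,
# without blocking on each node's PXE deploy.
- hosts: baremetal
  serial: 1                        # one node per batch (sequential deploy)
  vars:
    wait_for_node_deploy: false    # bifrost: do not wait for PXE to finish
  roles:
    - role: bifrost-deploy-nodes   # role name is an assumption
```

With `serial: 1` only one node is started at a time, and `wait_for_node_deploy: false` keeps the per-node cost to well under a few minutes, as noted above.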
14:35:30 <fdegir> david_Orange: good to know
14:35:31 <david_Orange> fdegir, the longer part for now is the image build
14:35:55 <fdegir> david_Orange: I think we intend to use pre-built images to cut that time short
14:36:04 <david_Orange> fdegir, i spent 3 days on it, going as far as tcpdumping the ipmi traffic, and it is the only workaround that worked :(
14:36:10 <fdegir> in fact, we are doing that for virtual deployments if I'm not mistaken
14:36:12 <fdegir> hwoarang: ^ ?
14:36:27 <hwoarang> yep
14:36:27 <fdegir> hwoarang: we are using prebuilt images, aren't we?
14:36:51 <fdegir> david_Orange: so the provisioning is working on baremetal and you are working on the osa parts now
14:37:00 <fdegir> is my understanding right?
14:37:03 <hwoarang> eh for the 'clean vm' not for the XCI vms
14:37:12 <hwoarang> for XCI vms we always build with diskimage-builder
14:37:16 <david_Orange> fdegir, that is what is done today, but when i tried it it failed, without time to debug
14:37:24 <fdegir> david_Orange: ok
14:37:49 <fdegir> david_Orange: but I think this part (provisioning) is something you can help mbuil with so he gets it working on lf-pod4 while you work on the osa parts
14:38:24 <david_Orange> fdegir, the provisioning is working on VM, i will test it for baremetal asap
14:38:38 <fdegir> david_Orange: thanks
14:38:40 <fdegir> david_Orange: anything else?
14:38:49 <david_Orange> fdegir, then test it on lfpod4 and long duration pod
14:39:15 <david_Orange> i will push the updated code when tested on BM
14:39:35 <fdegir> +1
14:39:37 <david_Orange> fdegir, nothing more on that
14:40:19 <fdegir> thanks david_Orange
14:40:41 <fdegir> #topic XCI update during Weekly Technical Discussion
14:41:04 <fdegir> #info I requested time from Bin to give an update to the community regarding our progress with XCI
14:41:17 <mbuil> fdegir: what community?
14:41:26 <fdegir> mbuil: opnfv community :)
14:41:29 <mbuil> oh ok
14:41:46 <fdegir> #info Please join it on February 1st, share your experiences/thoughts, and help me explain things/answer questions
14:42:06 <fdegir> #link https://wiki.opnfv.org/display/PROJ/Weekly+Technical+Discussion
14:42:19 <fdegir> #info Next week's agenda should appear in a few days
14:42:42 <fdegir> #topic Cross Community Infra/CI Workshop
14:43:05 <fdegir> #info We have been working on arranging a cross community infra/ci workshop together with other communities
14:43:44 <fdegir> #info And the interest in it is great; we will probably get participation from 7 communities: opnfv, odl, openstack, onap, cncf, fd.io, ansible
14:44:38 <david_Orange> great
14:44:54 <fdegir> #info This is thanks to the work you've all been doing since we as XCI are seen as an important actor in cross community collaboration area
14:45:13 <mbuil> fdegir: are the dates and venue already settled?
14:45:40 <fdegir> #info We are having some trouble with the logistics but we hope to have it sorted out
14:45:56 <fdegir> #info More info will be shared once things are finalized
14:46:12 <fdegir> mbuil: nope :/
14:46:24 <fdegir> #topic AoB
14:46:29 <fdegir> anyone wants to bring anything up?
14:46:57 <david_Orange> fdegir, kolla ?
14:47:03 <fdegir> #topic Kolla in XCI
14:47:08 <david_Orange> fdegir, just a few words
14:47:08 <fdegir> david_Orange: thanks for the reminder
14:47:17 <fdegir> david_Orange: any progress about ending the Kolla wars?
14:47:31 <david_Orange> fdegir, we had a small irc session with electrocucaracha and randyl
14:48:01 <david_Orange> fdegir, it seems that we are all on the same kolla basis, using the official documentation
14:48:49 <david_Orange> i propose to take the opportunity to use PDF/IDF directly for this new installer
14:49:25 <fdegir> david_Orange: ok, so you will work with upstream/vanilla kolla and incorporate pdf/idf into it while you work on bringing it into XCI?
14:49:57 <david_Orange> i also propose to prepare an enhanced inventory for this installer that can be used by others later, to avoid too much PDF analysis at the installer level
14:50:19 <david_Orange> fdegir, that was the proposition
14:50:38 <fdegir> david_Orange: may I ask you to come up with a high level proposal explaining what you intend to do and present it to the team before doing anything?
14:50:39 <david_Orange> but we did not get the time to finish the talk
14:51:01 <fdegir> david_Orange: we can have 30 minutes during one of the upcoming meetings so the team knows more about this
14:51:02 <david_Orange> fdegir, sure
14:51:23 <fdegir> david_Orange: ok, just ping me when you are ready and we add it to the agenda
14:51:46 <david_Orange> fdegir, i will keep you in touch, i am busy this week, not sure i can do that for next meeting
14:51:53 <fdegir> david_Orange: that's fine
14:51:55 <david_Orange> fdegir, great
14:52:22 <fdegir> #info david_Orange, electrocucaracha, and randyl had conversation around the topic
14:52:41 <fdegir> #info The work is planned to be done with upstream/vanilla kolla and pdf/idf will be incorporated into it
14:52:59 <fdegir> #info david_Orange will come up with a high level overview/plan and present it to team to collect feedback
14:53:02 <david_Orange> about that, a technical question
14:53:08 <fdegir> yes
14:53:32 <david_Orange> we do not need br-adm, br-storage and the other bridges now with ovs, can you confirm?
14:53:56 <fdegir> I can't unfortunately - I'm not good at these things
14:54:02 <fdegir> anyone else?
14:54:10 <fdegir> mbuil: mardim: hwoarang: ^
14:54:19 <david_Orange> this is done using ovs, not linux bridge anymore if i am not wrong
14:54:37 <hwoarang> i am not sure how ovs works with OSA to be honest
14:54:39 <david_Orange> this will impact the node network preparation step
14:54:42 <fdegir> yes, we switched to ovs
14:54:54 <fdegir> david_Orange: epalper was the one who integrated ovs
14:55:06 <fdegir> david_Orange: I suggest you talk to him when it comes to this
14:55:12 <david_Orange> ok
14:55:36 <fdegir> I'll ping epalper and tell him to ping you when he's online
14:55:40 <fdegir> he was sick this week
14:55:49 <mbuil> I am not using br-adm and br-storage in the SFC scenario
14:55:53 <david_Orange> fdegir, thx
14:56:05 <mbuil> david_Orange: do you see those when deploying xci?
14:56:09 <david_Orange> kolla is not using them too
14:56:46 <fdegir> I think we can end the meeting now if no one brings up a new topic in a minute
14:56:47 <mardim> david_Orange, I think we need them
14:56:57 <david_Orange> in the official documented OSA way, those 4 bridges are required
14:57:16 <mardim> david_Orange, because those are linux bridges which are related to the containers
14:57:17 <fdegir> thanks all and talk to you next week!
14:57:18 <hw_wutianwei> david_Orange: in my opinion, the ovs bridges are set on the neutron-agent lxc when integrating ovs.
14:57:20 <fdegir> #endmeeting