13:01:04 #startmeeting Cross Community CI
13:01:04 Meeting started Wed Aug 8 13:01:04 2018 UTC. The chair is fdegir. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:04 Useful Commands: #action #agreed #help #info #idea #link #topic.
13:01:04 The meeting name has been set to 'cross_community_ci'
13:01:29 #topic Rollcall
13:01:33 hwoarang: yes! Would it be possible to wait 2-3 weeks?
13:01:39 #info Markos Chandras
13:01:40 mbuil: ok
13:01:44 #info Jack Morgan
13:01:46 os you want to try new images for the fdegir problem
13:01:49 *or
13:01:54 #info Manuel Buil
13:01:56 here is the agenda on its usual place: https://etherpad.opnfv.org/p/xci-meetings
13:01:58 i can wait
13:02:24 and the first topic is the issue with the slaves
13:02:32 #topic Issues with Jenkins Slaves
13:02:56 so I went against the motto "if it's working, don't touch it" and broke things
13:03:08 the opnfv vm can't boot
13:03:11 fdegir: I am attending another meeting in parallel, so I might be slow :P
13:03:45 i tried 3 different kernel versions: 4.4.0-112, 4.4.0-122, 4.4.0-131
13:03:50 #info Víctor Morales
13:04:25 one of 112 and 122 is probably the kernel we used until this morning's update
13:04:38 what else should we look at?
13:05:00 fdegir: i just deployed yesterday without problem, opnfv host booted
13:05:01 or if one of you wants to take a look, the problematic deployment is available on intel-pod16-node3
13:05:12 jmorgan1: is it ubuntu and what's the kernel?
13:05:28 apart from that, when did you do apt update && apt upgrade ?
13:05:47 by opnfv host, we mean ubuntu_xci_vm or opnfv vm inside ubuntu_xci_vm
13:05:55 opnfv vm inside ubuntu_xci_vm
13:06:01 fdegir: yes, it's ubuntu
13:06:12 it's up and running for me
13:06:19 kernel 128 i believe
13:06:29 ok
13:06:46 can you do apt update && apt upgrade, recreate ubuntu_xci_vm and run xci-deploy.sh again?
13:06:47 fdegir: does an opensuse vm work?
13:06:58 to rule out probs with the VM itself?
13:07:09 i had other issues related to k8-calico-onap scenario, but the VM was running
13:07:14 hwoarang: nope
13:07:20 i mean, if this was triggered by a job, then one of the nodes will have opensuse running
13:07:32 on the same ubuntu xenial host configuration
13:07:47 hwoarang: https://build.opnfv.org/ci/job/xci-verify-opensuse-deploy-virtual-master/1855/console
13:07:59 hwoarang: https://build.opnfv.org/ci/job/xci-verify-ubuntu-deploy-virtual-master/1854/console
13:08:05 ok so it's a host thing
13:08:07 same error but it fails faster on opensuse ;)
13:08:17 no point in updating the images then
13:08:23 i don't think so
13:08:33 good
13:08:41 is there a way to roll back the complete apt upgrade somehow?
13:09:14 or find what kernel version the machine was booted with earlier?
13:09:23 where are the jenkins slaves?
13:09:24 journalctl --list-boots ?
13:09:31 cause i did all autoremove to get rid of old kernels as well
13:09:41 and then look at the logs for the boot you want with journalctl -b
13:09:59 there is only 1 boot entry
13:10:02 :(
13:11:25 fdegir: can you access the VM console?
13:13:38 no
13:14:19 does virsh say the VM is paused?
13:14:24 it's running
13:15:02 then what is the problem? the vm fails during the boot process?
13:15:17 but does it matter? it's panicked
13:15:17 http://paste.ubuntu.com/p/5BW7mP3SF5/
13:15:48 let's move on to the next topics
13:16:09 #topic OSA Shabump
13:16:16 this is problematic as well
13:16:26 i am looking into that
13:16:31 we need newer bifrost
13:16:31 ok
13:16:41 i bumped to the latest bifrost
13:16:45 to Manuel's patch
13:16:56 there is one more commit we are missing
13:17:17 Merged pharos: [idf.fuel] Add jumpserver.trunks for mgmt https://gerrit.opnfv.org/gerrit/60743
13:17:47 fdegir: the baremetal patch?
13:18:20 mbuil: i don't remember what it was exactly but after seeing you committed something to bifrost, i tried that one
13:18:29 hwoarang: ok
13:18:33 no we really need the HEAD of bifrost
13:19:01 but we will see
13:19:18 ok
13:19:40 the reason why bump is urgent is that I expect the projects will start arriving soon
13:19:45 for masakari and blazar
13:19:58 that's all about shabump
13:20:09 #topic Functest Issues
13:20:15 this is another tricky topic
13:20:28 singlevm test case times out
13:20:32 that is more urgent i think
13:20:53 i agree
13:21:08 i had a talk with cedric in the past few weeks and he basically said that nested virt is not proper CI and functest is not designed for such scenarios
13:21:15 well
13:21:20 so to me the question might be whether we want to keep functest on this level
13:21:23 i probably shouldn't comment on it
13:21:35 what if we were using openstack cloud for patch gating
13:21:38 maybe we should only use it for smoke + baremetal
13:21:40 getting slaves as vms
13:22:09 and if upstream openstack is doing tempest using virtual machines then what we are doing is not that strange either
13:22:15 but again
13:22:21 true but we have one more level of virtualization
13:22:33 i think it is same
13:22:50 doesn't openstack ci get slaves from openstack cloud using nodepool?
13:23:01 i think so
13:23:03 yes but we also have clean vm
13:23:09 ok
13:23:18 so openstack gets installed on those nodepool vms directly
13:23:36 then right
13:24:02 i'm wondering how it would perform if we run functest against an xci deployment done directly on host
13:25:15 anyway
13:25:19 the prob is that if your CI scenario is not considered supported by functest, then we have 0 help from them
13:25:26 *s/your/our
13:25:37 so, not sure if we can keep up with that
13:26:29 i really shouldn't say anything
13:26:36 i failed to explain what we are doing and why
13:26:50 either way
13:26:56 we would need one of us to join functest to gain the knowledge
13:27:03 it is not about knowledge
13:27:10 ok
13:27:11 moving on
13:27:43 when we meet next time over a beer, I'll give you details
13:27:49 we have knowledge to fix stuff. i am submitting patches to fix our cases but i get resistance because of our unsupported case and that's the prob
13:28:03 wait, because we haven't decided anything and this will block things on SHA bump
13:28:12 nope
13:28:26 we haven't switched back to latest version of functest yet
13:28:32 we are still using the pinned version
13:28:37 ok then
13:28:38 :/
13:28:41 we bump against that
13:28:49 and then come back to uplifting functest
13:28:57 because
13:29:03 anyway
13:29:07 #topic Baremetal Status
13:29:16 mbuil: so, you got it working
13:29:55 mbuil: when do you think we can review the change?
13:31:06 mbuil seems to be focused on his other meeting
13:31:10 fdegir: back
13:31:17 I had to talk in the other one
13:31:26 no multitasking ;(
13:32:11 so, baremetal patch works in ubuntu!
I deployed mini flavor several times with SFC- scenario in Ericsson POD2
13:32:36 However, the patch has several hardcoded things and I am currently fixing those
13:32:57 After that, I'll split the patch into several patches so that it is easily consumable for reviewers
13:33:04 +1
13:33:20 it would be nice to get a POD where the jumphost uses opensuse
13:33:26 because currently I cannot test it
13:33:51 what do you mean?
13:34:00 why should it matter?
13:34:48 LFPOD4 and Ericsson POD2 have Ubuntu in the jumphost. The code is currently using that information to decide what distro to install in OPNFV and in the blades
13:35:02 mbuil: export DISTRO=opensuse before xci-deploy.sh
13:35:02 the jump server is running the opnfv host vm which is deploying to BM nodes
13:35:11 Are you suggesting to define the distro with e.g. an env variable?
13:35:14 yes
13:35:18 that's what we do in ci
13:35:28 oh, that is new to me
13:35:34 then ignore my request
13:35:35 all slaves are ubuntu but we are deciding what distro to bring the nodes up with
13:35:57 hwoarang: correct me if I'm wrong
13:36:31 mbuil: we are looking forward to the split-up patches
13:36:34 i'm not sure what we are saying here
13:36:39 thanks for the great work
13:36:55 jump server with any OS running the opnfv host VM which deploys to actual BM nodes
13:37:05 jmorgan1: you are not bound by the os of the jumphost
13:37:19 jmorgan1: by default, the host os is chosen as the os for target nodes
13:37:20 fdegir: right, that is what I'm saying
13:37:23 jmorgan1: by jumphost I meant the host which hosts the opnfv vm
13:37:37 jmorgan1: but you can change the target node os
13:37:45 mbuil: yes, this is what OPNFV defines as jumphost
13:38:20 jmorgan1 the default behaviour is: if jumphost has distro A, opnfv VM gets distro A and compute and controller get distro A
13:38:25 agreed, otherwise, how are you going to know what OS to deploy to nodes
13:38:34 by setting DISTRO var
13:38:50 which is what xci actually does behind the scenes
if DISTRO is not set
13:39:07 ok, i think we are all on the same page
13:39:07 and this was mbuil's question
13:39:27 we don't care which OS is on jump host
13:39:39 we support three OS on opnfv host vm
13:39:40 yes, that's it
13:39:51 which deploys three OS to BM nodes
13:40:16 anything else mbuil?
13:40:27 so mbuil should be able to use opensuse based opnfv VM to test
13:40:50 nothing else. I am pretty busy right now but I hope to give 50% of my time to this
13:41:04 thanks mbuil
13:41:12 #topic k8-calico-onap
13:41:32 jmorgan1: electrocucaracha: how is it going?
13:42:11 fdegir: well, jmorgan1 has been facing the nested virtualization issues
13:42:24 fdegir: trying to test the new scenario
13:42:51 fdegir: we have doubts about the proper syntax for adding a new scenario using gerrit as a source
13:43:32 fdegir: jmorgan1 has a draft for that, we'll need some help on that
13:43:44 electrocucaracha: jmorgan1: https://gerrit.opnfv.org/gerrit/#/c/58945/17/xci/opnfv-scenario-requirements.yml
13:43:59 the version matches the sha of your commit
13:44:14 and the refspec matches the refspec of your change on gerrit
13:44:18 we also ran into an issue where we needed to update role (create-vm-nodes) to remove python-libvirt / install virtualbmc
13:45:01 i think this issue is resolved
13:45:26 but we didn't find any patch solving that issue
13:45:26 fdegir: progress is coming along
13:45:58 create-vm-nodes was the 1st problem we had to solve
13:46:23 but now that issue is not there anymore?
13:46:28 next we are getting opnfv-scenario-requirements.yml working
13:46:34 right
13:46:36 ok
13:46:45 might have been fixed upstream
13:47:03 when installing virtualbmc playbook, package names changed
13:47:18 from libvirt-python to python-libvirt
13:47:49 we tested in ubuntu distro
13:47:51 we created a patch to the playbook to remove the old package first, then install new one with different name
13:48:20 currently working on opnfv-scenario-requirements.yml
13:49:03 ran into nested virt issues then
13:49:18 that's all
13:49:28 thanks jmorgan1, electrocucaracha
13:49:40 #topic os-nosdn-osm
13:49:59 the scenario is merged but hit slave issues with the patch integrating the scenario
13:50:12 should be done soon once we get slaves back
13:50:24 #topic os-odl-osm_sfc
13:50:42 i'll try to get this started and then pass it to mbuil
13:51:00 we will also meet with osm guys regarding the next steps
13:51:15 like what should we test and progress with opensuse support
13:51:21 fdegir: this will need to happen in H release probably.
No cycles unless somebody helps :P
13:51:25 if anyone is interested in joining, ping me
13:51:31 mbuil: that's fine
13:51:47 cause we really don't know how we can test the basic osm yet
13:52:00 so if we can get those pieces for os-nosdn-osm during G-release
13:52:03 i'd be happy
13:52:33 #topic k8-odl-coe
13:52:41 i don't see Peri online so moving on
13:52:53 #topic k8-nosdn-istio
13:53:14 hw_wutianwei_ might be away as well
13:53:46 #topic Testing Objectives
13:53:57 jmorgan1: let's talk about this
13:54:17 for patch verification, we run things in VMs
13:54:36 we do deployment using mini flavor and run old version of functest healthcheck
13:55:12 the reason for running things in VMs (nested virt) is to ensure we always start with clean VM to prevent environment related issues causing unnecessary failures
13:55:20 and utilize slaves more
13:55:36 the next testing is during post merge
13:55:49 the idea is to run functest smoke for merged patches
13:55:55 ok, let me explain
13:56:00 and then we take the scenario to baremetal with full functest and yardstick
13:56:06 jmorgan1: ok
13:56:21 i think i got my answer with our discussion the last two days
13:56:42 we have nested virtualization as you pointed out, and baremetal that mbuil is working on
13:56:59 no one is doing non-nested VM testing
13:57:25 i was curious about what others are doing when we first started getting nested virt errors
13:57:33 others meaning?
13:57:40 the xci team
13:57:58 i think it might be a good use case in the future to have non-nested vm support
13:58:00 i do the same
13:58:07 use ubuntu_xci_vm
13:58:15 but this doesn't mean you have to
13:58:24 you can directly execute xci-deploy.sh
13:58:36 it should be better now given that all the ironic/bifrost stuff is pushed into the opnfv vm
13:59:47 yup, i'll look into it when i have time, thanks
13:59:58 ok
14:00:06 #topic Multi-distro Support for New Scenarios
14:00:23 so we want to support all the distros we have
14:00:27 but it is not always possible
14:00:35 what is the expectation here? just curious as a new scenario owner
14:00:50 i say start with one distro but develop the scenario in a way that new distro support can easily be introduced
14:01:10 and if you find time, implement support for other distros as well
14:01:41 how do we know which distro is supported? been tested?
14:02:04 ubuntu and opensuse are supported for most of the distros
14:02:13 centos is kind of tricky since it is broken upstream osa
14:02:26 so you can start with opensuse or ubuntu
14:03:35 if you are basing your scenario on an existing one, opnfv-scenario-requirements.yml tells you which distros are supported for that scenario
14:04:04 we are over time
14:04:15 #topic XCI nodes in LF Portland lab
14:04:26 jmorgan1: what about these nodes?
14:04:39 i added three new systems for xci
14:04:55 it says "LF Portland Lab"
14:04:55 i was thinking of using them as develop systems
14:05:03 I suppose it should be Intel Portland Lab, or?
14:05:17 but we might want to use them for CI only
14:05:25 no, not correct
14:05:40 they are in the LF lab, with LF3,4,5
14:05:47 oh
14:05:52 ok
14:05:55 thus, LF Portland lab ;)
14:06:18 which pod are these nodes added to?
14:06:32 no pod, they are their own
14:06:38 ok
14:06:52 each system has one of the supported OSes
14:07:11 anyway, let's think about how to use them
14:07:18 yes
14:07:24 either develop hosts or for CI
14:07:31 we can talk about that separately
14:07:32 CI only i mean
14:07:41 we need to go through the resource situation soon anyway
14:07:44 i wanted to share that we have them
14:08:33 thanks for that
14:08:47 let's move to the last topic
14:09:03 #topic XCI Developer Guide
14:09:11 jmorgan1: thanks for looking into it
14:09:39 jmorgan1: anything else you want to say?
14:09:48 it's empty ;)
14:10:04 i don't mind starting to work on it, but I'll need some help
14:10:16 as the least knowledgeable person
14:10:17 jmorgan1: I think Tapio started working on it
14:10:22 can you take a look at it
14:10:24 good to know
14:10:28 and perhaps incorporate the comments I posted
14:10:40 do you have link?
14:10:51 https://gerrit.opnfv.org/gerrit/#/c/51445/
14:10:59 please feel free to amend
14:11:04 thanks
14:11:27 np
14:11:43 then we end the meeting here if there is nothing more
14:11:50 thank you all for joining
14:11:53 thank you too
14:11:58 thanks
14:11:59 and welcome back to the ones who had holiday
14:12:10 and have a nice holiday to the ones that are going soon
14:12:14 talk to you in 2 weeks
14:12:16 #endmeeting
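
Note on the DISTRO discussion under "Baremetal Status": the behaviour described in the log (the jumphost distro is used for the opnfv VM and the target nodes unless DISTRO is set) can be sketched roughly as below. This is only an illustration of that fallback rule, not code from xci-deploy.sh; the function name resolve_distro and its argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from xci-deploy.sh: choose the distro
# for the opnfv VM and the target nodes.
# $1 is the distro of the jumphost (assumed to be detected elsewhere,
# for example from the ID field of /etc/os-release).
# If DISTRO is set and non-empty it wins; otherwise fall back to $1.
resolve_distro() {
    host_distro="$1"
    printf '%s\n' "${DISTRO:-$host_distro}"
}
```

With this rule, running "export DISTRO=opensuse" before xci-deploy.sh, as suggested at 13:35:02, would select opensuse even on an ubuntu jumphost, while leaving DISTRO unset keeps the jumphost's distro.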