[FOLIO-2649] Rancher pipelines fail on Build Docker step on folio-eks-1-us-west-2 Created: 17/Jun/20  Updated: 17/Jun/20  Resolved: 17/Jun/20

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Bug Priority: P2
Reporter: John Malconian Assignee: John Malconian
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Relates
relates to FOLIO-2650 Provision new EKS cluster to replace ... Closed
Sprint: DevOps: sprint 90
Development Team: FOLIO DevOps

 Description   

Communication between the Rancher Jenkins slave/agent pod and the Rancher Jenkins server pod is disrupted, and the Rancher pipeline fails, when the Build Docker step is invoked.



 Comments   
Comment by John Malconian [ 17/Jun/20 ]

Before this step occurs in the pipeline, communication between the Jenkins build agent and the Jenkins server is fine. It just so happens that the VPC the K8s cluster runs in is assigned 172.17.0.0/16. It also just so happens that Docker, by default, assigns its docker0 bridge an IP from the very same 172.17 class B range. Modern Docker versions are smart enough, however, that if they detect 172.17 is already in use by another interface, they assign a different class B range. For example, when I enabled the Docker bridge (docker0 interface) on the worker node hosts as a troubleshooting step, Docker assigned it an IP from 172.18 so that it would not conflict. However, when the Rancher Docker-in-Docker container is invoked to build a Docker image in the build agent pod, it is not so smart: it creates a docker0 interface in the pod and assigns it a 172.17 address. At that point, the pod has a local docker0 route covering the same range as the VPC, and routing from the build agent pod to any other K8s pod is borked.
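
The overlap described above can be checked with a minimal sketch like the one below. It only uses Python's standard ipaddress module. The 172.17.0.0/16 VPC range and the 172.18 fallback come from this ticket; the function name conflicts and the constant names are hypothetical, and the exact /16 used for the 172.18 fallback is an assumption.

```python
import ipaddress

# VPC CIDR of the cluster, as stated in this ticket.
VPC_CIDR = "172.17.0.0/16"
# Docker's default docker0 bridge range.
DOCKER_DEFAULT_BRIDGE = "172.17.0.0/16"
# Range Docker fell back to on the worker node hosts (the /16 width is assumed).
HOST_FALLBACK_BRIDGE = "172.18.0.0/16"


def conflicts(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share any addresses."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


if __name__ == "__main__":
    # The D-in-D bridge in the pod collides with the VPC range.
    print(conflicts(VPC_CIDR, DOCKER_DEFAULT_BRIDGE))
    # The bridge on the worker node hosts does not.
    print(conflicts(VPC_CIDR, HOST_FALLBACK_BRIDGE))
```

A check like this could be run against any candidate VPC CIDR before provisioning, to confirm it stays clear of the range the D-in-D container assigns to docker0.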

It is not possible to change the CIDR block of the AWS VPC (or at least not easily in a way that the cluster can use), nor the configuration of the Rancher D-in-D container, so a new cluster on a new VPC needs to be created to resolve this. Will create a separate Jira issue for that.

Generated at Thu Feb 08 23:22:14 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.