[FOLIO-2732] bogus 1-minute-long "build folio platform" builds Created: 18/Aug/20 Updated: 25/Aug/20 Resolved: 25/Aug/20 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Task | Priority: | P2 |
| Reporter: | Zak Burke | Assignee: | John Malconian |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Attachments: |
|
| Sprint: | DevOps: Sprint 95, DevOps: Sprint 96 |
| Development Team: | FOLIO DevOps |
| Description |
|
On multiple occasions over the last several weeks, the nightly builds for folio-snapshot and folio-testing have cruised through the "Build FOLIO platform" step in only 1 minute, when it normally takes ~25, and reported success even though the resulting build was in fact unusable. Logs:
|
| Comments |
| Comment by David Crossley [ 19/Aug/20 ] |
|
One clue is that this seems to not be a problem for the "folio-testing-core-backend" and "folio-snapshot-core" builds. |
| Comment by John Malconian [ 25/Aug/20 ] |
|
The issue impacts all reference builds that are launched on ec2 instances. In the pipeline, we look for an ec2 instance with a certain tag. If that tag is found, we terminate the existing instance before launching a new instance via the Ansible ec2 module. The Ansible ec2 module launches a new instance only if an existing instance with the same tag is not running. So sometimes a new instance is not launched, because the module detects an existing instance that is still in the process of shutting down. No failure is returned by Ansible (by design) and the playbook continues, which is why the build step finishes in about a minute and still reports success. I've made the following modification to counteract this condition: increased the sleep time after terminating the existing instance from 60 seconds to 120 seconds, to ensure the previous instance has been completely terminated. Closing for now. If this continues to be a problem, we can reopen. |
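
For illustration only: the fix applied in this ticket was a longer fixed sleep. A possible alternative, if the problem recurs, is to poll the old instance's state until it reports "terminated" before launching the replacement. The sketch below is an assumption, not what the pipeline does. `wait_until_terminated` and `get_state` are hypothetical names; `get_state` stands in for whatever call returns the EC2 instance state ("shutting-down", "terminated", and so on).

```python
import time
from typing import Callable


def wait_until_terminated(
    get_state: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_state() until it returns "terminated" or the timeout expires.

    get_state is a hypothetical callable supplied by the caller; it is
    assumed to return the current EC2 instance state as a string.
    Returns True if termination was observed, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == "terminated":
            return True
        if clock() >= deadline:
            # Give up so the caller can fail the build explicitly
            # instead of continuing against a stale instance.
            return False
        sleep(interval_s)
```

A caller would fail the pipeline when this returns False, so a slow shutdown surfaces as an error rather than as a 1-minute "successful" build.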