[FOLIO-3091] Vagrant builds time out in tenant init Created: 24/Mar/21 Updated: 24/Apr/21 Resolved: 24/Apr/21 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Bug | Priority: | TBD |
| Reporter: | Wayne Schneider | Assignee: | John Malconian |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original estimate: | Not Specified |
| Issue links: |
| Sprint: | DevOps Sprint 110, DevOps Sprint 111, DevOps Sprint 112 |
| Development Team: | FOLIO DevOps |
| Description |
|
Error message: snapshot: fatal: [default]: FAILED! => {"changed": false, "content": "", "elapsed": 900, "msg": "Status code was -1 and not [200]: Connection failure: timed out", "redirected": false, "status": -1, "url": "http://10.0.2.15:9130/_/proxy/tenants/diku/install?deploy=true&tenantParameters=loadReference%3Dtrue%2CloadSample%3Dtrue"}
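
For readers unfamiliar with the failing request, the following is a minimal sketch that decodes the URL from the error message into its parts using only the Python standard library. The helper name `decode_install_url` is made up for illustration and is not part of any FOLIO tooling.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the error message above.
url = ("http://10.0.2.15:9130/_/proxy/tenants/diku/install"
       "?deploy=true&tenantParameters=loadReference%3Dtrue%2CloadSample%3Dtrue")


def decode_install_url(url):
    """Return (path, deploy flag, tenant parameters) for an install URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # parse_qs percent-decodes values: %3D becomes "=" and %2C becomes ","
    query = parse_qs(parts.query)
    tenant_params = dict(
        pair.split("=", 1)
        for pair in query["tenantParameters"][0].split(",")
    )
    return parts.path, query["deploy"][0], tenant_params


path, deploy, tenant_params = decode_install_url(url)
print(path, deploy, tenant_params)
```

Decoded, the request asks Okapi to deploy the modules and to load both reference and sample data for the `diku` tenant.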
|
| Comments |
| Comment by Wayne Schneider [ 24/Mar/21 ] |
|
In a test build, Okapi was OOM-killed. |
| Comment by Wayne Schneider [ 24/Mar/21 ] |
|
Increasing the VM size to 20GB allows the build to complete. Testing now with packer/Jenkins. |
| Comment by Wayne Schneider [ 24/Mar/21 ] |
|
Even with 20GB the tenant init times out. More testing required. |
| Comment by Wayne Schneider [ 29/Mar/21 ] |
|
This issue has now cropped up with the testing-backend build: testing-backend: fatal: [default]: FAILED! => {"changed": false, "content": "", "elapsed": 900, "msg": "Status code was -1 and not [200]: Connection failure: timed out", "redirected": false, "status": -1, "url": "http://10.0.2.15:9130/_/proxy/tenants/diku/install?deploy=true&tenantParameters=loadSample%3Dtrue%2CloadReference%3Dtrue"} |
| Comment by Wayne Schneider [ 30/Mar/21 ] |
|
In testing, it appears that mod-search is flipping out and never returning from the tenant init call. Why this is not happening with the AWS builds is a bit mysterious. |
| Comment by Wayne Schneider [ 30/Mar/21 ] |
|
Investigation suggests that mod-search cannot return from the tenant init call until mod-authtoken is enabled for the tenant and the mod-search system user can log in and return a token. A couple of possible considerations:
|
| Comment by Wayne Schneider [ 31/Mar/21 ] |
|
The issue only comes up if there are messages in Kafka (as, for example, when inventory data are created with the loadSample=true tenant parameter). The module seems to go into a tight loop trying to get a token from Okapi, and only breaks out of it when mod-authtoken is finally initialized. It seems like the loop consumes all available cycles and the module is not able to return from tenant initialization. |
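
To illustrate the difference between a tight retry loop and a paced one, here is a minimal sketch of retrying with exponential backoff. It is not mod-search's actual code. The names `wait_for_token` and `fetch_token`, and all delay values, are assumptions chosen for the example.

```python
import time


def wait_for_token(fetch_token, max_attempts=8, base_delay=0.5,
                   max_delay=30.0, sleep=time.sleep):
    """Call fetch_token() until it returns a token, pausing between tries.

    fetch_token is a hypothetical callable that returns a token string,
    or None while login is not yet possible. Sleeping between attempts
    yields the CPU instead of spinning.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        token = fetch_token()
        if token is not None:
            return token
        if attempt < max_attempts:
            sleep(delay)
            # Double the pause each time, up to max_delay.
            delay = min(delay * 2, max_delay)
    raise TimeoutError(f"no token after {max_attempts} attempts")
```

With a pause between attempts, the caller leaves cycles free for other work (here, returning from tenant initialization) while it waits for login to become possible.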
| Comment by Wayne Schneider [ 01/Apr/21 ] |
|
Raised
|
| Comment by Wayne Schneider [ 02/Apr/21 ] |
|
This is not the whole explanation, however, since mod-search is not part of the testing-backend and testing Vagrant builds. |
| Comment by Wayne Schneider [ 02/Apr/21 ] |
|
In the testing-backend build, the tenant init eventually succeeds, but it takes more than 15 minutes, so the Ansible play times out! |
| Comment by Wayne Schneider [ 02/Apr/21 ] |
|
Updated timeout setting for tenant init. There are still issues with mod-search, however, so holding this open until those can be resolved. |
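
The `"elapsed": 900` in the error output corresponds to the 15-minute limit mentioned above. As a rough sketch of the same idea outside Ansible, the snippet below posts to the install endpoint with a longer client-side timeout. The function name `install_modules`, the 3600-second value, and the module id are placeholders. They are not the settings actually applied to the build.

```python
import json
import urllib.request


def install_modules(okapi_url, tenant, modules, timeout_s=3600,
                    opener=urllib.request.urlopen):
    """POST a module list to the tenant install endpoint.

    timeout_s is an assumed value; urlopen applies it as a socket
    timeout to blocking operations such as waiting for the response.
    """
    url = (f"{okapi_url}/_/proxy/tenants/{tenant}/install"
           "?deploy=true"
           "&tenantParameters=loadReference%3Dtrue%2CloadSample%3Dtrue")
    body = json.dumps(modules).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(request, timeout=timeout_s) as response:
        return response.status


# Example call (placeholder module id, not taken from the build):
# install_modules("http://10.0.2.15:9130", "diku",
#                 [{"id": "mod-example-1.0.0", "action": "enable"}])
```

Raising the limit only hides the slowness. The underlying reason tenant init takes more than 15 minutes still needs investigating.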
| Comment by Wayne Schneider [ 06/Apr/21 ] |
|
Until
|
| Comment by Wayne Schneider [ 20/Apr/21 ] |
|
John Malconian I think this is the issue you are now working on, so reassigning to you. |
| Comment by Wayne Schneider [ 24/Apr/21 ] |
|
This appears to be resolved for the folio/snapshot Vagrant build. The folio/testing and folio/testing-backend builds have a different issue, now (
Thanks John Malconian! |